ABSTRACT

We  develop  and  estimate  optimal  age  replacement  policies  for  devices  whose  age 
is  measured  in  multiple  time  scales.  For  example,  the  age  of  a  jet  engine  can  be 
measured  in  chronological  time,  the  number  of  flight  hours,  and  the  number  of  landings. 
Under a single-scale age replacement policy, a device is replaced at age $\tau$ or upon failure,
whichever occurs first. We show that a natural generalization to $k \ge 2$ scales is to replace
non-failed devices when their usage path crosses the boundary of a $k$-dimensional region
M, where M is a lower set with respect to the matrix partial order. For lifetimes measured
in  two  scales,  we  consider  two  contexts.  In  the  first,  devices  age  along  linear  usage  paths. 
For  this  case,  we  generalize  the  single-scale  long-run  average  cost  and  estimate  optimal 
two-scale  policies.  We  show  these  policies  are  strongly  consistent  estimators  of  the  true 
optimal  policies  under  mild  conditions,  and  study  small-sample  behavior  using 
simulation.  For  the  second  context,  in  which  device  usage  paths  are  unknown,  we  use 
two-dimensional  renewal  theory  to  derive  the  long-run  average  cost  of  a  policy.  We  give 
examples  in  both  settings  and  note  that  these  ideas  generalize  to  more  than  two  scales. 
I. INTRODUCTION


In  practice,  the  age  of  a  device  is  often  measured  in  more  than  one  time  scale.  For 
example,  automobiles  age  in  the  “parallel”  scales  of  calendar  time  since  purchase  and 
number  of  miles  driven.  As  such,  routine  engine  maintenance  depends  on  both  of  these 
scales:  an  oil  change  is  recommended  every  three  months  or  3,000  miles,  whichever 
comes  first.  For  some  devices,  the  scale  most  relevant  for  maintenance  is  clear.  For 
example,  Kordonsky  and  Gertsbakh  (1993)  note  that  for  a  jet  engine  turbine,  the  duration 
of  the  warmup  period  is  the  most  relevant  (of  several  possible  scales)  but  for  the 
undercarriage  of  an  aircraft,  the  number  of  landings  is  most  relevant.  For  other  devices, 
however,  the  most  relevant  scale  for  maintenance  is  difficult  to  determine.  For  example, 
the  joint  between  an  aircraft  wing  and  the  fuselage  is  subjected  simultaneously  to 
corrosion  (thus  the  scale  “calendar  time”  is  relevant),  landing  stresses  (thus  number  of 
landings  is  relevant),  and  level  flight  stresses  (thus  total  flight  time  is  relevant),  as  noted 
by  Kordonsky  and  Gertsbakh  (1993).  In  any  case,  a  maintenance  policy  should  take  into 
account  the  parallel  scales  in  which  an  item  operates.  In  a  military  setting,  attempts  are 
made  to  model  the  effect  of  chronological  or  operational  time  on  the  failure 
characteristics  of  a  military  device  during  the  developmental  testing  phase.  During  this 
phase,  however,  it  may  be  difficult  or  impossible  to  accurately  model  the  effect  of  usage 
on  the  device  resulting  from  military  missions.  Thus,  classical  failure  models  are  used  to 
develop  single-scale  maintenance  policies,  even  though  it  is  well  known  that  the  device 
will  operate  in  the  parallel  scales  of  chronological  (or  operational)  time  and  number  of 
missions. Lifetime data including the total number of missions (e.g., landings) accrued at
the  time  of  device  failure  may  become  available  later  in  the  acquisition  cycle,  such  as 
during  operational  testing  or  upon  initial  fielding.  Military  maintenance  costs  should  be 
reduced  by  using  policies  that  directly  account  for  aging  in  multiple  scales.  In  this 
dissertation  we  focus  on  developing,  optimizing,  and  estimating  maintenance  policies,  in 
particular  age  replacement  policies,  based  on  multiple  time  scales. 

A.  SINGLE-SCALE  AGE  REPLACEMENT  POLICIES 

The  vast  majority  of  methods  for  developing  maintenance  policies  are  based  on  a 
single  time  scale;  see  McCall  (1965),  Pierskalla  and  Voelker  (1976),  and  Valdez-Flores 
and  Feldman  (1989)  for  comprehensive  reviews.  Among  the  most  useful  and  most 
studied  are  age  replacement  policies,  under  which  a  device  is  replaced  (or  overhauled)  at 
failure or at a predetermined age $\tau > 0$, whichever occurs first. Let X be a positive random
variable (r.v.) representing the lifetime of a device, i.e., the time when the device fails.
Let X have distribution function F; following Bather (1977) it will be convenient to
define $F(x) = P(X < x)$ and the survivor function as $S(x) = P(X \ge x)$. Thus, under an age
replacement policy, a device is replaced with a new one at time $\min\{X, \tau\}$. Let the cost
for replacement be K > 0 if the device is replaced due to age (i.e., preventively, since
$X \ge \tau$) and K + C if it is replaced due to failure (i.e., $X < \tau$), where the additional cost of
replacement at failure is C > 0. If devices have independent lifetimes, then replacement
times occur according to a renewal process. From the Renewal Reward Theorem (e.g.,
Ross, 1997), the long-run average cost per unit of time that the device is in use is

\[
C(\tau) = \frac{K + C\,F(\tau)}{\int_0^{\tau} S(u)\,du}, \qquad \tau > 0. \tag{1.1}
\]

A complete derivation of this expression can be found in Appendix A. If F is absolutely
continuous and has an increasing failure rate (IFR), then $C(\tau)$ has at most one minimum.
In addition, if the failure rate is continuous and strictly increasing to $\infty$, there exists a
unique and finite value $\tau^*$ minimizing $C(\tau)$ (e.g., Barlow & Proschan, 1965). Bergman
(1982) shows that a unique, finite $\tau^*$ is attained under slightly less restrictive conditions.
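
As a rough illustration of minimizing (1.1) numerically when F is fully specified, the following Python sketch evaluates the cost rate for a Weibull lifetime and locates its minimizer by golden-section search. The Weibull parameters, the cost values, the search bracket, and all function names are illustrative assumptions, not quantities taken from this work. The search assumes the cost is unimodal on the bracket, which is reasonable here because the chosen Weibull has a strictly increasing failure rate.

```python
import math


def survivor(u, shape, scale):
    """Weibull survivor function S(u) = exp(-(u / scale) ** shape)."""
    return math.exp(-((u / scale) ** shape))


def mean_time_in_use(tau, shape, scale, n=2000):
    """Approximate the integral of S over [0, tau] by composite Simpson's rule (n even)."""
    h = tau / n
    total = survivor(0.0, shape, scale) + survivor(tau, shape, scale)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * survivor(i * h, shape, scale)
    return total * h / 3.0


def cost_rate(tau, K, C, shape, scale):
    """Long-run average cost (1.1): (K + C F(tau)) / integral of S over [0, tau]."""
    F = 1.0 - survivor(tau, shape, scale)
    return (K + C * F) / mean_time_in_use(tau, shape, scale)


def optimal_age(K, C, shape, scale, lo, hi, tol=1e-7):
    """Golden-section search for the minimizer of cost_rate on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if cost_rate(c, K, C, shape, scale) < cost_rate(d, K, C, shape, scale):
            b = d  # minimizer lies in [a, d]
        else:
            a = c  # minimizer lies in [c, b]
    return 0.5 * (a + b)


# Hypothetical inputs: K / C = 0.5, Weibull lifetime with shape 2 and scale 1.
tau_star = optimal_age(K=1.0, C=2.0, shape=2.0, scale=1.0, lo=0.01, hi=5.0)
print(round(tau_star, 3), round(cost_rate(tau_star, 1.0, 2.0, 2.0, 1.0), 3))
```

Setting the derivative of (1.1) to zero gives the condition $h(\tau)\int_0^{\tau} S(u)\,du - F(\tau) = K/C$, where $h$ is the failure rate, so the returned value can be checked against that identity.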

When F is completely specified, $\tau^*$ can be found explicitly, but is more often
found with numerical methods. Glasser (1967) uses numerical methods to obtain charts
which can be used to find $\tau^*$ when F is truncated normal, gamma, or Weibull. When F is
unknown, there are numerous approaches available for estimating $\tau^*$ based upon lifetime
data. In most of these approaches, F in equation (1.1) is replaced with an estimator $\hat{F}$
based upon the data. This results in an estimator $\hat{C}(\tau)$ of the cost function $C(\tau)$; $\tau^*$ is
then estimated by minimizing $\hat{C}(\tau)$. For example, given a simple random sample
$X_1, \ldots, X_n$ of lifetimes from F, non-parametric estimators of $C(\tau)$ and $\tau^*$ can be found
using the empirical survivor function


\[
\hat{S}(\tau) = \sum_{i=1}^{n} I(X_i \ge \tau) / n, \tag{1.2}
\]


where $I(X_i \ge \tau) = 1$ if $X_i \ge \tau$ and 0 otherwise. It follows that the estimator of $C(\tau)$ is


\[
\hat{C}(\tau) = \frac{(K + C) - C\,\hat{S}(\tau)}{\int_0^{\tau} \hat{S}(u)\,du}, \qquad \tau > 0. \tag{1.3}
\]
From the definition of $\hat{S}(\tau)$, it is seen that $\hat{C}(\tau)$ is lower semi-continuous with
denominator strictly increasing on $(0, \infty)$ and numerator a lower semi-continuous step
function constant between observations. As a result, local minima of $\hat{C}(\tau)$ are found at
the observations and we define $\hat{\tau} = \arg\min_{X_i} \hat{C}(X_i)$. Also, $\hat{\tau}$ is not necessarily unique.
Arunkumar (1972) proves that $\hat{C}(\tau)$ and $\hat{\tau}$ are strongly consistent estimators of $C(\tau)$ and
$\tau^*$, respectively. Ingram and Scheaffer (1976) address estimation using the
non-parametric maximum likelihood estimator (MLE) of F under the restriction of F having
an increasing failure rate. The optimal policy $\tau^*$ can also be estimated under other
sampling schemes; for example, Kumar and Westberg (1997) estimate $\tau^*$ under
right-censoring, and Bather (1977), Frees and Ruppert (1985), and Aras and Whitaker (1992)
address sequential estimation of $\tau^*$. Graphical approaches can also be used to minimize
(1.1) and (1.3). Bergman (1977) uses the total time on test (TTT) plotting method of
Barlow and Campo (1975) to estimate $\tau^*$. This method is insightful since one can deduce
ranges of the ratio K/C for which a particular $\tau^*$ is optimal. Two comprehensive
treatments of this approach are contained in Bergman and Klefsjö (1982) and Klefsjö
(1986).
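
The non-parametric estimator described above can be sketched in a few lines of Python. The sketch evaluates the empirical cost (1.3) only at the observed lifetimes, since that is where its local minima occur, and uses the identity $\int_0^{\tau} \hat{S}(u)\,du = \frac{1}{n}\sum_i \min(X_i, \tau)$. The sample values, cost values, and function names are hypothetical and serve only to illustrate the computation.

```python
def empirical_cost(tau, lifetimes, K, C):
    """Empirical cost (1.3) with S_hat(tau) = #{X_i >= tau} / n."""
    n = len(lifetimes)
    s_hat = sum(1 for x in lifetimes if x >= tau) / n
    # Integral of the empirical survivor function over [0, tau].
    integral = sum(min(x, tau) for x in lifetimes) / n
    return ((K + C) - C * s_hat) / integral


def estimate_replacement_age(lifetimes, K, C):
    """Return an observation minimizing the empirical cost (smallest one if tied)."""
    return min(sorted(lifetimes), key=lambda x: empirical_cost(x, lifetimes, K, C))


# Hypothetical sample of four lifetimes with K / C = 0.5.
sample = [3.0, 1.0, 4.0, 2.0]
tau_hat = estimate_replacement_age(sample, K=1.0, C=2.0)
print(tau_hat, empirical_cost(tau_hat, sample, 1.0, 2.0))
```

Because the minimizer need not be unique, the sketch breaks ties by returning the smallest minimizing observation; that tie-breaking rule is a choice made here, not one prescribed by the text.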

B.  FAILURE  MODELING  IN  MULTIPLE  TIME  SCALES 

Extending  this  theory  so  it  can  be  used  for  maintenance  of  a  device  whose  age  is 
measured  in  multiple  scales  requires  more  than  generalizing  a  univariate  lifetime  X  to  a 
multivariate lifetime, say (X,Y). This is not always immediately apparent. Confusion
arises because data used to estimate multiple-scale policies often appear to be of the form
$(X_1,Y_1), (X_2,Y_2), \ldots, (X_n,Y_n)$. Nonetheless, the actual implementation of an age
replacement  policy  requires  that  a  device  be  tracked  continuously  through  time.  Even  in 
a  single  scale,  a  policy  cannot  be  implemented  by  observing  the  age  at  failure;  the  device 
is monitored through time so that it can be replaced at failure or time $\tau$, whichever comes
first.  The  implementation  of  such  a  policy  in  more  than  one  scale  requires  knowledge  of 
the  usage  path,  or  “history”  of  the  device;  this  notion  is  central  to  the  literature  of 
multiple  time  scales  (e.g.,  Duchesne  and  Lawless  (2000)).  Let  x  >  0  denote  the 
chronological  time  since  introduction  of  a  device  into  service,  and  let  y(x)  represent  usage 
accumulated  by  the  device  up  to  age  x  (e.g.,  the  total  number  of  miles  an  automobile  has 
been  driven  up  to  age  x).  The  usage  path  of  a  device  up  to  chronological  time  x  is 
defined to be $Z(x) = \{(u, y(u)) : 0 \le u \le x\}$. In addition, if the random variable X represents
the chronological age of the device at failure and Y = y(X), then (X,Y) represents the time
and cumulative usage at failure. In some cases a vector $\mathbf{y}(x)$ of various measures of usage
is available (e.g., $y_1(x)$ could be the number of flight hours accrued as of chronological
time x, and $y_2(x)$ could be the number of landings accrued as of chronological time x,
etc.). Then, the usage path is $Z(x) = \{(u, \mathbf{y}(u)) : 0 \le u \le x\}$. In most of what follows,
however,  we  assume  only  a  single  measure  of  usage  is  available  in  order  to  simplify  the 
presentation.  Typically,  a  measure  of  usage  is  required  to  be  both  non-decreasing  in  x 
and  an  external  covariate.  The  latter  requirement  (see  Kalbfleisch  and  Prentice,  1980, 
Section 5.3) ensures the usage path Z is determined independently of the time to failure
X.

Modeling the lifetime of a device whose failure depends upon the parallel effects
of  time  and  usage  has  received  a  great  deal  of  attention  in  the  past  decade.  Three  main 
approaches  are  found  in  the  literature.  The  first  approach  is  to  use  a  conditional  model. 
Lawless et al. (1995) model automobile warranty data by considering separately the
distribution of X along each path Z and the distribution of the paths. The second is to use
a joint model for failure times. This approach is taken by Singpurwalla and Wilson
(1998), Murthy et al. (1995), and Kordonsky and Gertsbakh (1994). Models built using
this  approach  do  not  rely  explicitly  on  the  notion  of  a  usage  path.  Due  to  the  inherent 
complexity  of  explicitly  modeling  lifetimes  and  paths  in  multiple  scales,  much  of  the 
recent  work  in  this  area  focuses  on  a  third  approach,  that  of  finding  appropriate  methods 
for  combining  scales  to  form  a  single  scale.  When  such  a  combined  scale  can  be  found, 
standard  univariate  reliability  tools  (including  age  replacement  theory)  can  be  brought  to 
bear.  Duchesne  and  Lawless  (2000)  unify  and  formalize  all  previous  work  in  combining 
scales. 

C.  MAINTENANCE  IN  MULTIPLE  TIME  SCALES 

Much  less  attention  has  been  given  to  maintenance  policies  based  on  multiple 
scales.  In  the  earliest  work  in  this  area,  Nakagawa  (1985)  derives  policies  for  devices 
that fail by either age or usage. He derives the expected cost rate $C(\tau, N)$ of the policy
under which a device is replaced at failure, at chronological age $\tau$, or at a discrete number
N of uses, whichever occurs first. In our setting, however, it is rarely evident whether
failure occurred due to age or usage. In addition, since it is common to have both age and
usage continuous (e.g., scales might be chronological time since production and total
flight time), we need models that allow usage to be continuous as well as discrete.
Unlike Nakagawa (1985), most recent work focuses on finding an appropriate combined
scale  to  be  used  for  preventive  maintenance.  With  this  approach,  the  cost  of  age 
replacement  can  then  be  computed  in  the  combined  scale,  and,  under  appropriate 
conditions,  an  optimal  replacement  age  can  also  be  found  in  that  scale.  The  major  work 
in  this  area  is  done  by  Kordonsky  and  Gertsbakh  (1994)  and  along  slightly  different  lines 
by  Kordonsky  and  Gertsbakh  (1993,  1995,  1997).  They  restrict  attention  to  linear 
combined scales $t(a) = (1 - a)x + a\,y(x)$, where $a \in [0,1]$. Under an age replacement policy
in such a scale, a device is replaced at age $\tau$ (in the combined scale) or upon failure at age
$T(a) = (1 - a)X + aY$, whichever occurs first. Most recently, Duchesne and Lawless (2000)
propose  an  “ideal”  time  scale  which  generalizes  some  of  the  work  of  Kordonsky  and 
Gertsbakh.  Although  not  motivated  specifically  with  preventive  maintenance  in  mind, 
they  suggest  that  their  scale  might  be  used  for  such  purposes.  The  ideal  scale  is 
developed  in  order  to  capture  chronological  age  and  usage  in  such  a  way  that,  under 
appropriate  conditions,  the  lifetime  distribution  of  a  device  in  this  scale  is  independent  of 
the path. Thus, in principle, an age replacement policy based on an ideal time scale could
be used for devices regardless of their usage path.

Because  using  combined  scales  reduces  the  problem  of  maintenance  in  multiple 
scales  to  that  of  maintenance  in  a  single  scale,  it  has  the  advantage  of  being  tractable  and 
easily  understood.  Combined  scales,  however,  do  not  completely  address  the  problems 
of  maintenance  in  multiple  scales.  Absent  from  the  literature  is  discussion  of  the 
translation of policies developed in combined scales to policies in the original scales.
Upon performing such a translation, it is clear that policies based on linear scales
correspond to replacing devices if their joint failure time (X,Y) falls in the region
$M = \{(x, y(x)) : (1 - a)x + a\,y(x) < \tau\}$ or when their usage curve crosses the boundary of this
region,  whichever  occurs  first.  Similarly,  policies  based  on  an  ideal  time  scale 
correspond  to  regions  in  the  positive  quadrant  whose  upper  boundaries  follow  the 
contours  of  the  ideal  time  scale.  Considering  such  regions  in  the  original  scales  suggests 
a  more  general  class  of  policies  that  should  be  considered  when  searching  for  the  optimal 
policy.  Also  absent  from  the  literature  are  methods  for  comparing  the  cost  of  policies 
based  on  combined  scales  of  different  forms.  The  approach  of  Kordonsky  and  Gertsbakh 
(1994)  does  provide  a  means  for  comparing  costs  in  the  special  case  of  the  family  of 
linear  scales.  As  such,  the  need  arises  for  a  means  to  compare  the  cost  of  policies  from  a 
larger  class  of  alternatives. 
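
To make the translation from a linear combined scale back to the original scales concrete, the Python sketch below tests membership in the region $\{(x, y) : (1 - a)x + a\,y < \tau\}$ and computes the chronological age at which a device leaves it. It assumes, purely for illustration, that the device follows a linear usage path $y(x) = \theta x$; the function names and all numerical values are hypothetical.

```python
def in_linear_region(x, y, a, tau):
    """True if (x, y) lies inside the region (1 - a) * x + a * y < tau."""
    return (1.0 - a) * x + a * y < tau


def crossing_age(a, tau, theta):
    """Chronological age at which the path y = theta * x reaches combined age tau.

    Along the path, the combined age is ((1 - a) + a * theta) * x, so the
    boundary is reached at x = tau / ((1 - a) + a * theta).
    """
    rate = (1.0 - a) + a * theta
    if rate <= 0.0:
        raise ValueError("combined age does not increase along this path")
    return tau / rate


# Hypothetical policy: a = 0.5, tau = 100 in the combined scale.
for theta in (0.5, 1.0, 3.0):
    x_cross = crossing_age(0.5, 100.0, theta)
    print(theta, x_cross, theta * x_cross)
```

A single replacement age in the combined scale therefore corresponds to a different chronological replacement age on each usage path, which is why the policy is naturally viewed as a region in the original scales.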

In  this  dissertation  we  directly  attack  the  problem  of  estimating  optimal  age 
replacement  policies  for  devices  with  age  measured  in  multiple  scales  in  two  different 
settings.  In  both,  our  focus  is  to  search  over  a  large  class  of  sensible  policies  to  minimize 
estimated  long-run  costs.  To  do  so,  we  first  define  a  class  of  multiple-scale  policies 
which  generalize  policies  found  in  previous  works.  In  Chapter  II,  we  use  several  real 
data  sets  to  help  develop  insight  into  our  choice  of  this  class  of  potential  policies. 

Because  this  class  of  policies  is  related  to  policies  produced  by  combined  scales,  in 
Chapter  III  we  review  and  discuss  in  detail  how  multiple-scale  policies  are  obtained  using 
the  scale-combining  approaches  found  in  the  literature.  In  this  chapter  we  also  discuss 
how these policies fit into the framework established in Chapter II. In so doing, we raise
significant  concerns  that  reveal  the  need  for  new  methods.  Since  usage  paths  are  often 
well-approximated  by  straight  lines,  in  Chapter  IV  we  develop  estimators  of  the  cost 
function  and  optimal  policy  for  the  case  in  which  devices  age  along  linear  usage  paths. 

In Chapter V we discuss the large- and small-sample properties of these estimators and
compare their performances with policies based on a common scale-combining approach.
In Chapter VI we develop a cost function for policies under a joint model for (X,Y) and
present  numerical  results  obtained  from  solving  the  corresponding  optimization  problem 
for  rectangular-shaped  policies.  In  Chapter  VII  we  highlight  our  contributions  and 
present  opportunities  for  further  research. 
II. EXTENDING AGE REPLACEMENT THEORY TO MULTIPLE TIME SCALES

We  seek  to  generalize  the  classical  age  replacement  policy,  under  which  a  device 
is replaced at age $\tau$ or failure (whichever occurs first), to a policy based on age measured
in  multiple  scales.  The  cost  function  used  to  define  an  optimal  policy  is  based  on  the 
mechanism  generating  the  failures.  However,  the  general  form  of  a  sensible  multiple- 
scale  age  replacement  policy  applies  equally  to  many  failure  models.  In  this  chapter,  we 
introduce  three  data  sets  to  help  develop  insight  into  an  appropriate  form  for  a  multiple- 
scale  age  replacement  policy.  The  data  sets  are  chosen  to  represent  situations  for  which 
either  the  conditional  modeling  approach  or  the  joint  modeling  approach  may  be 
appropriate.  In  the  first  and  third  data  sets,  it  is  apparent  that  failures  occur  along  fixed 
linear  usage  paths.  In  such  a  situation,  an  appropriate  model  is  one  which  generates 
failures  conditioned  on  the  usage  path  and  then  utilizes  a  mixing  distribution  over  the 
paths.  However,  in  the  second  data  set,  there  are  no  clear  usage  paths  and  the  data  are 
better  modeled  by  a  joint  distribution.  After  considering  the  three  data  sets,  we 
generalize  the  form  of  an  age  replacement  policy  to  incorporate  multiple  time  scales. 


A.  INTRODUCTORY  CASE  STUDIES 

Under a single-scale policy with replacement time $\tau$, a device is replaced if it fails
in the interval $(0, \tau)$ or if its time in use (the one-dimensional equivalent of a usage path)
crosses the right-most boundary of $(0, \tau)$. As we generalize to the case of multiple scales,
it will be convenient to identify a policy by the multiple-scale equivalent of the failure
replacement interval $(0, \tau)$. This leads to consideration of policies defined by regions M.
Here, a device is replaced if (X,Y) is in M (i.e., upon failure) or when its usage path
crosses the boundary of M, whichever occurs first. For now, we consider how such
policies might be constructed based on observed bivariate failure times $(x_1, y_1), \ldots, (x_n, y_n)$.
In what follows we use the notation $R(x, y)$ to denote the rectangle $(0, x) \times (0, y)$.

Case  Study  1 

Consider policy $M_X = R(\hat{\tau}, \infty)$, where $\hat{\tau}$ minimizes the empirical cost function
(1.3) based on the first components $x_1, \ldots, x_n$. Under this policy, we replace the device
when its age reaches $\hat{\tau}$ or it fails, whichever occurs first, regardless of the usage accrued.
Although  constructed  in  a  rather  naive  manner,  such  a  policy  may  be  adequate  in  some 
cases. 

For  example,  consider  the  locomotive  traction  motor  failure  data  in  Singpurwalla 
and Wilson (1998). The data (see Appendix B) consist of the time since inception of
service and mileage at failure of forty locomotive traction motors. Figure 2.1 shows a
scatterplot  of  the  failure  data  in  the  time  scales  number  of  days  and  number  of  miles  and 
the  regression  fit  through  the  origin.  The  coefficient  of  determination  exceeds  99%.  For 
these  data,  knowing  the  number  of  days  at  failure  is  almost  equivalent  to  knowing  the 
number  of  miles  at  failure  since  all  exemplars  have  virtually  identical  usage  rates  (i.e., 
number  of  miles  per  day).  Hence  a  “naive”  policy  based  solely  on  chronological  age 
suffices. Similarly, we could consider a mileage-based policy $M_Y = R(\infty, \hat{\nu})$, where $\hat{\nu}$
minimizes (1.3) based on $y_1, \ldots, y_n$. In fact, for ratios K/C > 0.25, the two regions
$M_X = R(1200, \infty)$ and $M_Y = R(\infty, 57304)$ are based on the same observation, namely
(1200, 57304).


Figure 2.1: Traction Motor Data with Regression Line.
Triangles  represent  the  number  of  days  and  miles  until  a  failure  occurred. 


Case  Study  2 

A  policy  based  on  a  single  scale  may  not  be  satisfactory  for  lifetime  data  arising 
from  devices  having  differing  usage  paths.  Figure  2.2  shows  a  scatterplot  of  failure  times 
of  jet  engines,  discussed  by  Gertsbakh  and  Kordonsky  (1998).  This  data  set  (see 
Appendix  B)  contains  the  flight  hours  and  number  of  landings  at  failure  of  21  Aeroflot  jet 
engines.  Unlike  the  first  data  set,  the  failures  have  occurred  along  several  usage  paths, 
and  these  paths  are  not  provided  or  evident  from  Figure  2.2.  Thus,  knowing  the  number 
of hours at failure is not equivalent to knowing the number of landings at failure. Hence,
a policy based on only the flight hours at failure or only the number of landings at failure
is likely to ignore information that could potentially reduce maintenance costs. In fact,
for K/C = 0.5, $M_X = R(4932, \infty)$ and $M_Y = R(\infty, 1152)$; these two policies (with boundaries
delimited by the overlaid dashed lines in Figure 2.2) are based on the vastly “different”
observations (4932, 1960) and (3227, 1152), respectively.
Figure 2.2: Jet Engine Data.

Triangles  represent  the  number  of  flight  hours  and  landings  until  a  failure 
occurred.  The  vertical  and  horizontal  lines  represent  the  boundaries  of, 
respectively,  a  policy  triggered  solely  by  the  number  of  flight  hours  at 
failure  and  the  policy  triggered  solely  by  the  number  of  landings  at  failure. 


Such  policies,  however,  are  often  used.  Gertsbakh  and  Kordonsky  (1997)  note 
that  a  single  distribution  is  often  fit  to  lifetime  data  arising  from  devices  operating  in 
heterogeneous  environments.  An  “optimal”  policy  is  estimated  from  this  distribution  and 
applied to the entire population. Policies of this form ignore the bivariate nature of the
failure data. For example, under policy $M_X$, devices with lifetimes $(x, y)$ and $(x, 2y)$ are
treated in the same manner, even though the latter device is “older” in some sense than
the former. A policy which somehow incorporates the additional information contained
in the paired failure times seems “better” than $M_X$. Consider the policy $M_{XY} = R(\hat{\tau}, \hat{\nu})$,
formed by combining $M_X$ and $M_Y$. Under policy $M_{XY}$ we replace a non-failed device
when it accrues either age $\hat{\tau}$ or usage $\hat{\nu}$, whichever occurs first; $\hat{\tau}$ and $\hat{\nu}$ are estimates of
the optimal replacement times in the two single-scale age replacement problems. Policy
$M_{XY}$ seems to be an improvement over both $M_X$ and $M_Y$, since it is based on all the data
and since in some cases (for example) devices with lifetimes $(x, y)$ and $(x, 2y)$ are treated
differently. Nevertheless, the separate computation of $\hat{\tau}$ and $\hat{\nu}$ ignores the dependency
between the failure times in the two scales. Policy $M_{XY}$ is based only on estimates of the
marginal distributions of failure time in the two scales, and thus does not fully account
for the joint effect of age and usage on failure. A bivariate policy should somehow
account for this dependence. Kordonsky and Gertsbakh (1995) explain, “Each particular
time scale reflects indirectly a most relevant process of damage accumulation, but fails to
reflect the joint, interactive action of these processes. For an aircraft ... ‘time in the air’
and ‘number of flights’ both reflect fatigue damage accumulation, but each scale
separately is not able to reflect ‘total’ fatigue damage.” In Chapter VI, we develop a cost
function that can be used to find the “best” policy of the form $M_{XY}$.
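
The replacement rule implied by a rectangular policy can be sketched as follows. The rectangle bounds 4932 flight hours and 1152 landings are the single-scale estimates quoted above for K/C = 0.5. The linear usage paths and their slopes (landings per flight hour) are hypothetical, since the actual jet engine paths are not available, and the function name is an assumption made for this example.

```python
def rectangle_replacement_age(u, v, theta):
    """Age x at which the linear path y = theta * x leaves the rectangle (0, u) x (0, v)."""
    if theta <= 0.0:
        return u  # no usage accrues, so only the age bound can be reached
    return min(u, v / theta)


u_hat, v_hat = 4932.0, 1152.0  # bounds of the rectangle R(4932, 1152)

for theta in (0.2, 0.5):  # hypothetical landings per flight hour
    x_rep = rectangle_replacement_age(u_hat, v_hat, theta)
    trigger = "hours" if x_rep == u_hat else "landings"
    print(theta, x_rep, theta * x_rep, trigger)
```

Under these assumed slopes, the lightly used engine is replaced when it reaches the flight-hour bound, while the heavily used engine is replaced earlier, when it reaches the landing bound.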
Case Study 3


Consider failures due to metal fatigue (see Appendix B), discussed in Kordonsky
and Gertsbakh (1993). The metal fatigue data plotted in Figure 2.3 consist of 30
observations, five on each of six distinct paths. Specimens on a particular path are
subjected to bending through a repetitive pattern of a fixed number of small-amplitude
(low-load) cycles followed by a fixed number of large-amplitude (high-load) cycles until
failure. In Figure 2.3, the scale along the horizontal axis is the number of low-load cycles
and the scale along the vertical axis is the number of high-load cycles. By design, the
observations fall almost perfectly on lines of slopes $\theta_1 = 0.053$, $\theta_2 = 0.250$, $\theta_3 = 0.667$,
$\theta_4 = 1.5$, $\theta_5 = 4$, and $\theta_6 = 19$. The dashed lines in Figure 2.3 represent these approximate
linear usage paths.


Figure  2.3:  Metal  Data  with  Approximate  Linear  Usage  Paths. 

Each  triangle  represents  the  number  of  low-load  and  high-load  cycles  until 
a  failure  occurred,  scaled  by  a  factor  of  1/10. 


In data sets of this form, each device ages along a linear path of slope $\theta_i$,
$i = 1, \ldots, m$, where $0 < \theta_1 < \theta_2 < \cdots < \theta_m < \infty$. As such, the data set can be naturally
partitioned into m samples, each consisting of failure data along a linear path. As with
the traction motors, a policy can be specified for devices along a given usage path solely
in terms of chronological time, since at any time x > 0 the position of a device along its
usage path is known. To construct such a policy, consider the sample along each usage
path separately. That is, use the $n_i$ chronological ages at failure along the ith path to
estimate $F_i$, the conditional lifetime distribution of $X \mid \theta = \theta_i$. Then, use the empirical cost
function (1.3) to estimate the optimal age replacement policy $\hat{\tau}_i$ (which applies only to
devices on the ith path). The resulting policy, with replacement times summarized in the
vector $(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$, takes the following form: replace a non-failed device on path i
when its chronological age reaches $\hat{\tau}_i$, $i = 1, \ldots, m$.
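
The path-by-path construction just described can be sketched in Python as below. Failures are grouped by path, and the empirical cost (1.3) is minimized over the observed ages within each group. The path labels, failure ages, cost values, and function names are hypothetical; the metal fatigue data themselves are in Appendix B and are not reproduced here.

```python
from collections import defaultdict


def empirical_cost(tau, ages, K, C):
    """Empirical cost (1.3) computed from the failure ages on one path."""
    n = len(ages)
    s_hat = sum(1 for x in ages if x >= tau) / n
    integral = sum(min(x, tau) for x in ages) / n
    return ((K + C) - C * s_hat) / integral


def composite_policy(failures, K, C):
    """Map each path label to its estimated replacement age.

    failures is an iterable of (path_label, chronological_age_at_failure).
    """
    by_path = defaultdict(list)
    for path, age in failures:
        by_path[path].append(age)
    return {
        path: min(sorted(ages), key=lambda t: empirical_cost(t, ages, K, C))
        for path, ages in by_path.items()
    }


# Hypothetical failure ages on two paths, with K / C = 0.5.
data = [("slow", 100.0), ("slow", 200.0), ("slow", 300.0), ("slow", 400.0),
        ("fast", 10.0), ("fast", 20.0), ("fast", 30.0), ("fast", 40.0)]
policy = composite_policy(data, K=1.0, C=2.0)
print(policy)
```

Each path is treated in isolation here, so nothing in this construction forces the replacement ages on neighboring paths to be compatible with one another.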

For the metal data, suppose each $F_i$ is estimated with the empirical distribution,
placing mass $1/n_i = 0.2$ on each observation on the ith path. Upon doing so, for K/C = 0.5
we obtain the following estimates: $\hat{\tau}_1 = 23580$, $\hat{\tau}_2 = 10300$, $\hat{\tau}_3 = 5700$, $\hat{\tau}_4 = 3200$,
$\hat{\tau}_5 = 1000$, $\hat{\tau}_6 = 275$. Hence, the “composite” policy is as follows: replace non-failed
devices on path 1 at age 23580; ... ; replace non-failed devices on path 6 at age 275. The
region corresponding to this policy is depicted on the left side of Figure 2.4.

At  first  glance,  the  proposed  composite  policy  seems  reasonable;  however,  the 
implementation  of  the  policy  is  problematic.  Consider  two  devices,  say  A  and  B. 
Suppose A has usage path 5, namely $\{(x, 4x) : x > 0\}$, and B has usage path 4, namely
$\{(x, 1.5x) : x > 0\}$. Under the composite policy, if device A is still operating we would
replace it preventively when its age reaches x = 1000; at this time, it has usage
y(x) = 4000. However, if device B is still operating at x = 3000, we would not replace it;
at this time, its usage is y(x) = 4500. The metal fatigue experiment was designed so that
the  accumulation  of  low-load  cycles  and  the  accumulation  of  high-load  cycles  are  the 
only  factors  leading  to  device  failure.  As  such,  this  composite  policy  does  not  seem 
sensible,  because  device  B  is  older  than  device  A  in  every  respect. 


Figure  2.4:  Composite  Policies  for  the  Metal  Data. 

The solid lines on the left side of the figure represent the failure replacement
region  for  the  policy  with  replacement  time  vector  (23580,  10300,  5700, 
3200,  1000,  275).  The  right  side  of  the  figure  depicts  the  failure 
replacement  region  for  the  policy  with  replacement  time  vector  (10000, 
10300,  5700,  3200, 1200,  275). 


However,  this  is  not  the  only  problem  we  could  encounter  using  this  approach. 
Consider  the  same  data,  and  suppose  that  instead  of  the  policy  suggested  above,  we 
obtain  policy  (10000, 10300,  5700,  3200,  1200,  275).  The  region  corresponding  to  this 
policy is depicted on the right side of Figure 2.4. Suppose device A is on path 1 and
device B is on path 2. Under this policy, if device A is still operating at age x = 10000,
we would replace it preventively; at this time its cumulative usage is y(x) = 526.
However, if device B is still operating at age x = 10000, we would not replace it (as it has
not yet reached age x = 10300); at age x = 10000 its cumulative usage is y(x) = 2500.
Device  B  is  older  than  A  in  every  respect;  this  composite  policy  does  not  seem  sensible 
either.  We  now  investigate  the  notion  of  a  “sensible”  policy  in  more  detail. 

B.  DESCRIPTION  OF  A  CLASS  OF  MULTIPLE-SCALE  POLICIES 

In this section we describe a class of multiple-scale policies which generalizes the
class of single-scale policies {(0, τ): τ > 0}. We assume devices under consideration may
differ only in their age in chronological time and in the amount of usage accumulated.
As such, we implicitly assume there are no “hidden” covariates (e.g., better
environmental conditions for certain devices, or additional measures of usage) affecting
the process leading to eventual device failure. One example of a policy which
generalizes the policy (0, τ) is M = (0, u) × (0, v), where u > 0 and v > 0, as considered in
Case Study 2. Under this policy, a device is replaced if it fails at a time (X, Y) where
X < u and Y < v, or when its usage path crosses the boundary x = u or y = v, whichever
occurs first. Kordonsky and Gertsbakh (1994) devise policies based on lifetimes in two
scales by projecting failure times onto a single time scale of the form t = (1 − α)x + αy(x),
in which they define a replacement age τ_α. This policy replaces at age t = τ_α or upon
failure, whichever occurs first. In the original two scales, this policy corresponds to the
region M = {(x, y(x)): (1 − α)x + αy(x) < τ_α}. In fact, for α = 0, M = M_x, as in Case Study 1;
similarly, for α = 1, M = M_y. When 0 < α < 1, M is a right triangle with right angle at the
origin.

On  the  other  hand,  consider  the  policy  M  depicted  in  Figure  2.5.  From  a 
preventive  maintenance  standpoint,  this  policy  is  not  sensible  since  the  device  with 
(x,y(x))  =  (50,25)  would  be  replaced  preventively,  but  a  non-failed  device  with 
(x,y(x))  =  (55,90)  would  not  be  replaced,  even  though  it  is  older  than  the  first  device  in 
both  time  scales.  In  order  to  be  sensible  under  the  assumptions  described  above,  a  policy 
prescribing  preventive  replacement  of  a  device  should  prescribe  preventive  replacement 
of  any  “older”  device.  On  the  other  hand,  if  a  policy  stipulates  that  a  device  should  not 
be  replaced  preventively,  then  any  “younger”  device  should  not  be  replaced  preventively 
either.  To  describe  this  more  formally,  we  need  a  means  of  ordering  two-dimensional 
failure times.

Figure 2.5: Undesirable Policy.

Under this policy, for example, a non-failed aircraft component with x = 50
flight hours and y(x) = 25 landings would be replaced, but one with x = 55
flight hours and y(x) = 90 landings would not be replaced.


A binary relation ⪯ on a set 𝒳 is a simple order on 𝒳 if it is reflexive, transitive,
anti-symmetric, and the members of every pair of elements of 𝒳 are comparable. The
relation ⪯ is a partial order on a set 𝒳 if it is reflexive, transitive, and anti-symmetric
(thus, simple orders are partial orders; however, for partial orders, certain elements of 𝒳
may be non-comparable). In addition, L ⊂ 𝒳 is a lower set with respect to a partial order
⪯ if u ∈ L, v ∈ 𝒳, and v ⪯ u imply v ∈ L (e.g., Robertson, Wright, and Dykstra, 1988); a
lower set contains all “predecessors” of each of its members. For failure times u = (u1, u2)
and v = (v1, v2) in 𝒳 ⊆ (0,∞)², we take ⪯ to be the matrix partial order where u ⪯ v if and
only if u1 ≤ v1 and u2 ≤ v2. Note that 𝒳 may be a proper subset of (0,∞)², as in Case Study
3, where all failure times lie along one of six linear usage paths.
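The lower-set requirement can be checked mechanically on a finite set of points. The following is a minimal sketch, not part of the original analysis: the function names and the rectangle bounds are hypothetical, and the composite policy is restricted to the two paths (slopes 4 and 1.5, replacement ages 1000 and 3200) quoted in Case Study 3.

```python
def precedes(u, v):
    """Matrix partial order: u precedes v if no coordinate of u exceeds v's."""
    return all(a <= b for a, b in zip(u, v))


def violations(points, in_region):
    """Pairs (v, u) with u inside the region, v preceding u, yet v outside it."""
    return [(v, u) for u in points if in_region(u)
            for v in points if precedes(v, u) and not in_region(v)]


# Composite policy of Case Study 3, restricted to the two paths quoted in the
# text: usage-path slope -> preventive replacement age on that path.
replacement_age = {4.0: 1000, 1.5: 3200}


def in_composite(point):
    x, y = point
    return x < replacement_age[y / x]  # device not yet replaced on its path


def in_rectangle(point, u=3500, v=5000):  # hypothetical bounds, for contrast
    x, y = point
    return x < u and y < v


device_a = (1000, 4000)  # path 5: replaced preventively at x = 1000
device_b = (3000, 4500)  # path 4: not replaced until x = 3200
composite_violations = violations([device_a, device_b], in_composite)
rectangle_violations = violations([device_a, device_b], in_rectangle)
```

Here `composite_violations` contains the pair (A, B): device B is still inside the region although the "younger" device A has already left it, so the composite region is not a lower set. The rectangular region produces no such pair.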

Using these definitions, we now characterize a class of policies for the multiple-scale
age replacement problem. For ease of exposition, they are described in the plane.
Let 𝒳 denote the support of (X, Y), and ℳ_𝒳 denote the class of all open lower sets with
respect to the matrix partial order on 𝒳. Observe that for 𝒳 = (0,∞), the class of single-scale
policies {(0, τ): τ > 0} is the class of open lower sets with respect to the simple
order ≤ on (0,∞). Thus, ℳ_𝒳 is a natural generalization of the class of single-scale policies.
In addition, members of ℳ_𝒳 are “sensible” policies from the standpoint of
implementation when failure characteristics are captured by the two time scales. In the
literature, Murthy et al. (1995) use rectangular, triangular, and other planar regions as
warranty policies; every region they consider is a lower set with respect to the matrix
partial order on (0,∞)². Similarly, the policies developed in Case Studies 1 and 2 above
are members of ℳ_𝒳, but the policies described in Case Study 3 are not. For ages
measured in k > 2 scales, the notation is easily extended so that ℳ_𝒳 is the class of open
lower sets in 𝒳 ⊆ (0,∞)^k with respect to the matrix partial order generalized to (0,∞)^k.

C.  NESTED  POLICIES 

Let r = K/C denote the ratio of the preventive replacement cost and the additional
cost to replace a device due to failure. As r decreases, it becomes proportionally more
costly to replace at failure, so the replacement age based on a single scale should be more
conservative. To show this, we make explicit the dependence on cost ratio r and define

$$ D(\tau; r) = C(\tau)/C \qquad (2.1) $$

where C(τ) is the cost function in (1.1). Let ν = inf{x: S(x) = 0}; ν ≤ ∞. Then, for cost
ratio s < r,


$$ D(\tau; r) - D(\tau; s) = \frac{r - s}{\int_0^{\tau} S(u)\,du} \qquad (2.2) $$


is a positive, continuous, and strictly decreasing function of τ on (0, ν). Suppose D(τ; s),
and hence C(τ) with cost ratio s, attains a minimum at τ*(s); there may be several
minima. It can be shown that τ*(s) ≤ ν. For τ < τ*(s),

$$ D(\tau; r) - D(\tau^*(s); r) = [D(\tau; r) - D(\tau; s)] + [D(\tau; s) - D(\tau^*(s); s)] + [D(\tau^*(s); s) - D(\tau^*(s); r)]. \qquad (2.3) $$

Since τ*(s) minimizes D(τ; s), the second term on the right-hand side of (2.3) is
non-negative; in addition, because (2.2) is strictly decreasing on (0, ν), the sum of the first and
third terms is positive. Thus, D(τ; r) > D(τ*(s); r) for all τ < τ*(s), and it follows that C(τ)
with cost ratio r can only attain a minimum for τ ≥ τ*(s).
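The monotonicity argument can be checked numerically. The sketch below is illustrative only: it assumes the standard form D(τ; r) = [r + F(τ)] / ∫₀^τ S(u) du for the normalized cost, takes a hypothetical Weibull lifetime with survivor function S(u) = exp(−u²), and locates the minimizer by a simple grid search with trapezoidal integration. The function name, grid limits, and distribution are assumptions, not taken from the source.

```python
import math


def optimal_age(r, shape=2.0, t_max=3.0, n=3000):
    """Grid-search minimizer of D(tau; r) = (r + F(tau)) / int_0^tau S(u) du
    for a Weibull lifetime with S(u) = exp(-u**shape) (hypothetical choice)."""
    step = t_max / n

    def surv(u):
        return math.exp(-(u ** shape))

    integral, best_tau, best_val = 0.0, None, math.inf
    for i in range(1, n + 1):
        tau = i * step
        # Trapezoidal update of the integral of the survivor function.
        integral += 0.5 * step * (surv(tau - step) + surv(tau))
        val = (r + 1.0 - surv(tau)) / integral
        if val < best_val:
            best_tau, best_val = tau, val
    return best_tau


ratios = [1.0, 0.5, 0.25, 0.125]           # decreasing cost ratios
taus = [optimal_age(r) for r in ratios]    # corresponding replacement ages
```

For this lifetime distribution the replacement ages in `taus` decrease along with the cost ratios, so the intervals (0, τ) are nested.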

Hence, for a decreasing sequence of cost ratios r1, r2, ..., the corresponding
single-scale policies are nested. That is, if the corresponding optimal replacement times
are, respectively, τ1, τ2, ..., then we know τ1 ≥ τ2 ≥ ..., so that the policies
(0, τ1) ⊇ (0, τ2) ⊇ ... form a sequence of nested lower sets.

Multiple-scale age replacement policies should also be more conservative as r
decreases; in particular, policies for smaller r should be subsets of those for larger r. Let
𝒳 = (0,∞)². Consider the policies based on region M1 = {(x, y(x)): x + y(x) < 6, x > 0} for
r1 = 1 and M2 = {(x, y(x)): 5x + y(x) < 15, x > 0} for r2 = 0.5. Note both M1 and M2 are in
ℳ_𝒳. Now, consider a device with linear usage path y(x) = 5x; the policies and usage path
are depicted in Figure 2.6. This example illustrates that non-nested multiple-scale policies
can prescribe replacement times that are not sensible. With r1 = 1, the additional cost to
replace a device due to failure is equal to the preventive replacement cost, while r2 = 0.5
means the additional cost to replace a device due to failure is twice the preventive
replacement cost. Thus, it seems policy M2 should hedge against this higher failure
replacement cost and suggest replacement at an earlier time than the time suggested by
policy M1. On this path, however, M1 prescribes replacement at x = 1, whereas M2
prescribes replacement at x = 1.5.

Figure 2.6: Non-nested Policies.

Solid lines represent boundaries of policies M1 and M2 and the dashed line
represents a linear usage path of slope 5. Under policy M1, non-failed
devices on this path are replaced when x = 1; under policy M2, non-failed
devices on this path are replaced when x = 1.5.

III. POLICIES BASED ON COMBINED SCALES


Due  to  the  complexity  of  modeling  lifetimes  in  multiple  scales,  much  of  the  recent 
work  in  this  area  focuses  on  finding  appropriate  methods  for  combining  scales  to  form  a 
single  time  scale.  Once  such  a  combined  scale  is  found,  reliability  tools  such  as  age 
replacement  theory  can  be  brought  to  bear.  We  begin  with  a  general  discussion  of 
combined  scales.  We  then  consider  in  detail  three  combined  time  scales  in  the  literature 
that  seem  best  suited  for  age  replacement  policies  given  failure  data  in  two  scales,  age 
and  usage.  The  first,  and  in  a  sense  closest  in  spirit  to  our  efforts,  is  the  work  of 
Kordonsky  and  Gertsbakh  (1994)  in  which  a  combined  scale  is  found  for  age 
replacement.  The  next  two  scales  discussed  are  the  “minimum  CV”  scale  of  Kordonsky 
and  Gertsbakh  (1993,  1995,  1997)  and  the  “ideal”  time  scale  of  Duchesne  and  Lawless 
(2000).  Both  of  these  time  scales  are  based  solely  on  the  underlying  failure  models  and 
are  developed  independently  of  the  age  replacement  problem.  However,  Gertsbakh  and 
Kordonsky  (1997)  do  suggest  a  context  in  which  their  min  CV  scale  is  “optimal”  for 
preventive  maintenance  and  Duchesne  (1999)  suggests  his  scales  might  be  useful  for 
maintenance  planning. 

A.  COMBINED  TIME  SCALES 

A formal definition of “time scale” is given by Duchesne and Lawless (2000).
Let Z(x) denote a device usage path up to age x, and let the set of all such paths be 𝒵(x).
For a particular device, let the “whole” usage path be Z = Z(∞); let the set of all such
paths be 𝒵 = 𝒵(∞). A time scale Φ(x, Z(x))
is a non-negative real-valued functional of x and the path Z up to age x; it is required to
be non-decreasing in x for all Z in 𝒵. Hence, a time scale is a function of chronological
time and external covariates. Recent research efforts focus on finding a time scale for
which t_Z(x) = Φ(x, Z(x)) suffices for the calculation of probabilities for failures modeled in
two scales. Oakes (1995) introduces the notion of the “collapsibility” of two time scales
into one time scale which is “fully informative” in the sense that the probability of
survival to a specified point (in the plane) depends only on the location of the point, not
on the path taken to get to the point. Specifically, following Duchesne and Lawless
(2000), the distribution of X | Z is “collapsible in y(x)” if the survival probability at time x
depends on the path Z up to x only through its endpoint (x, y(x)). Thus, a time scale
for a collapsible model can be written as t_Z(x) = Φ(x, y(x)). Collapsible models are
common in the literature since in many cases X and Y = y(X) are observable but the
history Z(X) is unknown. If the usage path is approximated by a straight line, the
resulting models are collapsible since y(x) = θx and hence the path Z is known by its
value y(x) at any time x.

To illustrate the consequences of combining time scales in a collapsible model,
consider the time scale t = x + gy(x) for some g > 0. Note t induces a family of contours
{y = (t − x)/g, t ∈ (0,∞)}, as depicted in Figure 3.1 (Duchesne, 1999). The points where
the usage paths intersect a given dashed contour line all have the same age (in the
combined scale).

Figure 3.1: Contours of Linear Scale in a Collapsible Model.

Jagged  lines  represent  device  usage  paths  and  dashed  lines  represent 
contours  of  a  linear  time  scale.  Reproduced  from  Duchesne  (1999). 

This family of contours provides a means to compare points on different usage
paths that may be non-comparable with respect to the matrix partial order. Consider the
points of intersection of contour t = 4 with the four usage paths in Figure 3.1. The matrix
partial order does not enable us to determine the relative age of devices having age and
usage represented by these points. On the other hand, the four points have the same age
in scale t. Thus, the combined scale t induces an ordering (by age in this scale) of a set of
points (x1, y(x1)), (x2, y(x2)), ..., (xn, y(xn)). In addition, as illustrated by the contours, the
scale provides a means of specifying the relative age of one device in relation to another.

Different time scales order and “space” a given set of lifetimes differently. To
illustrate this, consider Figure 3.2, which contains a scatterplot of labeled points
(x1, y1), (x2, y2), ..., (x10, y10), randomly generated from the unit square; lines of slope −1
correspond to contours of scale t = x + y(x) and lines of slope −10 correspond to contours
of scale s = x + 0.1y(x). Table 3.1 lists the coordinates of the points, their “age” in the
two scales, and their ranks r(t) and r(s) in the two scales t and s.
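The ordering action of the two scales can be reproduced in a few lines. This is a minimal sketch with three hypothetical points (not the ten points of Table 3.1); the helper names are ours.

```python
def age(point, g):
    """Age of a point (x, y) in the linear combined scale x + g*y."""
    x, y = point
    return x + g * y


def ranks(points, g):
    """Rank of each point (1 = youngest) by age in the scale x + g*y."""
    order = sorted(range(len(points)), key=lambda i: age(points[i], g))
    result = [0] * len(points)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


points = [(0.6, 0.1), (0.2, 0.7), (0.5, 0.5)]  # hypothetical lifetimes
ranks_t = ranks(points, 1.0)   # scale t = x + y
ranks_s = ranks(points, 0.1)   # scale s = x + 0.1y
```

The first point is the youngest of the three in scale t but the oldest in scale s, so the two scales disagree on the ordering as well as on the spacing.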


Figure 3.2: Contours of Linear Scales t and s.

Labeled points are randomly generated from the unit square. Lines of
slope −10 and −1 are contours of linear time scales s and t, respectively.

Table 3.1: The “Action” of Two Different Time Scales.

This table summarizes some of the information in Figure 3.2. Row 7
indicates (x7, y7) has age 0.84 in scale t, age 0.65 in scale s, is the fourth
“youngest” point in scale t, and is the eighth “youngest” in scale s.

Using the combined scale, an age replacement policy can be expressed as (0, τ).
In this form, a policy may have limited utility to the practitioner. On the other hand, a
graphical depiction of this policy in terms of the original scales age and usage is very
useful. In the original scales, the t-scale policy (0, τ) is equivalent to
M = {(x, y(x)): Φ(x, y(x)) < τ}. For example, policy (0, 0.4) in scale t above “translates” to
the policy {(x, y(x)): x + y(x) < 0.4} in Figure 3.2. In fact, the policy M_x discussed in
Chapter II is a special case of such a “translation”; in this case, the combined scale is
simply x. For most combined scales found in practice, an age replacement policy in the
combined scale corresponds to a lower set in the original scales. This is only the case,
however, when the combined time scale Φ is such that for (x1, y1(x1)) and (x2, y2(x2)) where
x1 ≤ x2 and y1(x1) ≤ y2(x2) we have Φ(x1, y1(x1)) ≤ Φ(x2, y2(x2)). In other words, since time
scales are by definition required only to be increasing in x for any Z, it is possible to
construct combined scales for which the policy in the original scales is not a lower set.
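The translation of a combined-scale policy to the plane, and the monotonicity condition that makes the result a lower set, can be sketched as follows. The scale `odd_scale` is a hypothetical example of our own: it increases in x along every linear path y = θx but decreases in y, so it violates the condition. The grid check is a finite-sample diagnostic, not a proof.

```python
from itertools import product


def linear_scale(x, y):
    """Scale t = x + y from the example above."""
    return x + y


def odd_scale(x, y):
    """Hypothetical scale: equals x * (2 - theta / (1 + theta)) on the path
    y = theta * x, hence increasing in x there, yet decreasing in y."""
    return 2 * x - x * y / (x + y)


def region(scale, tau):
    """Translate the combined-scale policy (0, tau) to the original scales."""
    return lambda x, y: scale(x, y) < tau


def respects_matrix_order(scale, grid):
    """True if no pair of grid points ordered coordinate-wise is reversed."""
    for (x1, y1), (x2, y2) in product(grid, repeat=2):
        if x1 <= x2 and y1 <= y2 and scale(x1, y1) > scale(x2, y2):
            return False
    return True


grid = [(0.1 * i, 0.1 * j) for i in range(1, 11) for j in range(1, 11)]
policy = region(linear_scale, 0.4)          # {(x, y): x + y < 0.4}
linear_ok = respects_matrix_order(linear_scale, grid)
odd_ok = respects_matrix_order(odd_scale, grid)
```

On this grid the linear scale passes the check while the hypothetical scale fails it, so a policy {Φ < τ} built from the latter need not be a lower set.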

B.  A  COMBINED  SCALE  FOR  AGE  REPLACEMENT 

Kordonsky and Gertsbakh (1994) find the “best” scale for age replacement among
the family of scales that are convex combinations of the two scales of age and usage.
They consider the family of scales {t(α) = (1 − α)x + αy(x), α ∈ [0,1]}; in scale t(α) the
lifetime is T(α) = (1 − α)X + αY. The geometric interpretation of times in scale t(α) is
insightful. Time t(α) = (1 − α)x + αy(x) is proportional to the length of the orthogonal
projection of the point (x, y(x)) onto vector (1 − α, α); the search for the “best” scale is
essentially a search for the “best” such vector onto which to project the data.




For a fixed α, let F_α(t) = P(T(α) ≤ t), and define

$$ C_\alpha(\tau) = \frac{K + C\,F_\alpha(\tau)}{\int_0^{\tau} [1 - F_\alpha(u)]\,du}, \quad \tau > 0. \qquad (3.1) $$

Thus, C_α(τ) is identical to the long-run average cost function (1.1). To find the “best” α,
it seems reasonable to find, for a given α, the optimal replacement time in this scale (say
τ_α), and then search [0,1] for the α yielding minimal C_α(τ_α). However, C_α(τ_α) has
dimension cost per unit of time in the scale t(α). Thus, values of C_α(τ_α) must be
“converted” to make them comparable. To this end, Kordonsky and Gertsbakh convert
(3.1) into a cost function with dimension cost per unit of chronological time in the
following way. Because the average lifetime in scale t(α) is E[T(α)] and the average
lifetime in chronological time x is E[X], then from a damage accumulation perspective
one unit of “t(α)-time” is equivalent to E[X]/E[T(α)] units of x-time. Hence, the
“converted” cost function is

$$ D_\alpha(\tau) = C_\alpha(\tau)\,E[T(\alpha)]/E[X], \quad \tau > 0. \qquad (3.2) $$

Let τ_α = argmin D_α(τ). By definition, the “best” scale corresponds to the α* which yields
the minimum value of D_α(τ_α).
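A plug-in version of this search might look as follows. This is a sketch under stated assumptions, not the estimator of Kordonsky and Gertsbakh: the empirical distribution replaces F_α in (3.1), candidate replacement ages are restricted to the observed lifetimes, α is searched over a grid, and the data are hypothetical.

```python
def empirical_cost(times, tau, K, C):
    """Plug-in estimate of (3.1): (K + C * F_n(tau)) / int_0^tau (1 - F_n)."""
    n = len(times)
    failed = sum(1 for t in times if t <= tau) / n
    # The integral of the empirical survivor function equals mean(min(T, tau)).
    exposure = sum(min(t, tau) for t in times) / n
    return (K + C * failed) / exposure


def best_scale(xs, ys, K, C, n_alpha=101):
    """Grid search for the alpha (and tau) minimizing the converted cost (3.2)."""
    mean_x = sum(xs) / len(xs)
    best = None
    for k in range(n_alpha):
        alpha = k / (n_alpha - 1)
        times = [(1 - alpha) * x + alpha * y for x, y in zip(xs, ys)]
        mean_t = sum(times) / len(times)
        for tau in times:  # candidate ages: observed lifetimes (assumption)
            d = empirical_cost(times, tau, K, C) * mean_t / mean_x
            if best is None or d < best[0]:
                best = (d, alpha, tau)
    return best


xs = [120, 340, 200, 410, 150, 280]   # hypothetical ages at failure
ys = [30, 55, 80, 60, 20, 75]         # hypothetical usages at failure
converted_cost, alpha_star, tau_star = best_scale(xs, ys, K=1.0, C=2.0)
```

Because the search is repeated for each pair (K, C), the selected α may change with the cost ratio r = K/C.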

Kordonsky and Gertsbakh estimate α* nonparametrically based on a simple
random sample (X1, Y1), (X2, Y2), ..., (Xn, Yn). Care needs to be taken in applying their
method, however. Consider the auto data set, taken from Wilson (1993), which can
be found in Appendix B. The boundaries of the policies for cost ratios r = 0.5, 0.25, and
0.125 are depicted in Figure 3.3. The policies are lower sets, but they exhibit the
non-nested behavior illustrated in Figure 2.6. This suggests that the “best” scale is a function
of the cost ratio. For the metal data, however, the policies are nested for {r: 0 < r < 1}.
We suspect the non-nestedness of the policies derived from the auto data may be caused,
in part, by the lack of sufficient spread in the distribution of usage paths. As such, it can
be argued that non-nestedness is exhibited here since most observations in the auto data
set fall roughly along a single regression line fit through the origin (unlike the metal
data).
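Whether two triangular policies are nested can be read off their axis intercepts. The sketch below encodes a policy {(x, y): ax + by < c} as a hypothetical triple (a, b, c) and, because the auto data are not reproduced here, uses the regions M1 and M2 of Figure 2.6 as input.

```python
import math


def intercepts(a, b, c):
    """Axis intercepts of the boundary a*x + b*y = c (a, b >= 0, c > 0)."""
    return (c / a if a > 0 else math.inf, c / b if b > 0 else math.inf)


def is_nested(inner, outer):
    """True if the region of `inner` lies inside that of `outer` on the
    positive quadrant, i.e. both intercepts of `inner` are no larger."""
    ix, iy = intercepts(*inner)
    ox, oy = intercepts(*outer)
    return ix <= ox and iy <= oy


def crossing_age(policy, theta):
    """Age x at which the linear path y = theta * x reaches the boundary."""
    a, b, c = policy
    return c / (a + b * theta)


M1 = (1, 1, 6)    # x + y < 6, cost ratio r1 = 1
M2 = (5, 1, 15)   # 5x + y < 15, cost ratio r2 = 0.5
nested = is_nested(M2, M1)
ages = (crossing_age(M1, 5), crossing_age(M2, 5))
```

Here `nested` is false, and on the path y = 5x the policy for the smaller cost ratio replaces later (x = 1.5) than the policy for the larger one (x = 1).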


Figure  3.3:  Non-nested  Policies  for  Auto  Data  Based  on  “Best  Scale”  Method. 
Triangles  represent  the  number  of  days  and  miles  until  a  failure  occurred. 
Labeled  lines  are  policy  boundaries  for  cost  ratios  r=  0.5,  0.25,  and  0.125. 


C.  POLICIES  BASED  ON  MINIMUM  CV  SCALE 

We now examine another combined scale on which policies can be based.
Consider again the family of linear scales 𝒯_α = {t(α) = (1 − α)x + αy(x), α ∈ [0,1]}. Let
CV[T(α)] denote the coefficient of variation of the lifetime in scale t(α); Kordonsky and
Gertsbakh (1993, 1995, 1997) identify the scale having α* minimizing CV[T(α)]. They
prove the (unrestricted) minimizer of CV²[T(α)] has α* = g*/(1 + g*), where

$$ g^* = \frac{E[Y]\,\mathrm{Var}[X] - E[X]\,\mathrm{Cov}(X,Y)}{E[X]\,\mathrm{Var}[Y] - E[Y]\,\mathrm{Cov}(X,Y)}. \qquad (3.3) $$


Since the family of scales specifies α ∈ [0,1], it is important to describe the cases
leading to α* ∉ [0,1]. In fact, from (3.3) we can show that α* ∉ [0,1] iff either Case A or
Case B holds in (3.4):

$$ \begin{aligned} \text{Case A:}\quad & CV^2(X) < \frac{\mathrm{Cov}(X,Y)}{E[X]\,E[Y]} < CV^2(Y), \\ \text{Case B:}\quad & CV^2(Y) < \frac{\mathrm{Cov}(X,Y)}{E[X]\,E[Y]} < CV^2(X). \end{aligned} \qquad (3.4) $$


In practice, an estimate α̂* of α* is obtained by replacing each of the terms in (3.3) with
its sample estimate; Cases A and B are modified accordingly. Duchesne and Lawless
(2000) note that when Case A holds, the minimizer of CV²[T(α)] in 𝒯_α has α* = 0, so that
t = x is the min CV scale. When Case B holds, α* = 1, so that t = y(x) is the min CV scale.
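The sample version of (3.3), together with the two boundary cases of (3.4), can be sketched as below. The function names and data are hypothetical; sample variances and covariance use the n − 1 divisor, which does not affect the minimizer.

```python
def cv_squared(values):
    """Squared sample coefficient of variation."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1) / m ** 2


def min_cv_alpha(xs, ys):
    """Sample estimate of alpha* from (3.3), restricted to [0, 1] via (3.4)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    vy = sum((y - my) ** 2 for y in ys) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    c = cxy / (mx * my)
    cv2x, cv2y = vx / mx ** 2, vy / my ** 2
    if cv2x < c < cv2y:      # Case A: chronological time alone
        return 0.0
    if cv2y < c < cv2x:      # Case B: usage alone
        return 1.0
    den = mx * vy - my * cxy
    if den == 0:             # limiting case g* -> infinity
        return 1.0
    g = (my * vx - mx * cxy) / den
    return g / (1 + g)


xs = [10, 12, 9, 15, 11, 14]   # hypothetical ages at failure
ys = [3, 8, 2, 5, 9, 4]        # hypothetical usages at failure
alpha_hat = min_cv_alpha(xs, ys)
```

For these data neither Case A nor Case B applies, and `alpha_hat` falls strictly between 0 and 1. A brute-force search of CV²[T(α)] over a fine grid of α values gives the same answer to within the grid spacing.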

Consider using the min CV scale to construct a multiple-scale age replacement
policy based on a simple random sample (X1, Y1), (X2, Y2), ..., (Xn, Yn). If the sample
version of Case A holds, the policy is M_x (as in Case Study 1 of Chapter II). This means
that if we use “min CV” as the criterion for time scale selection, it suffices to base the
policy solely on the distribution of chronological time at failure. Similarly, if Case B
holds, it suffices to base the policy solely on the distribution of cumulative usage at
failure. Gertsbakh and Kordonsky (1997) note that if Cov(X, Y) ≤ 0, then α* ∈ [0,1], so
neither Case A nor Case B can occur. In this “more interesting” situation, we often find
0 < α* < 1 (we note it is possible for α* to be 0 or 1 if Cov(X, Y) ≤ 0). In this case,
policies for a decreasing sequence of ratios r form a sequence of nested right triangles.
For example, consider the metal data set. From the sample version of (3.3) we find
α̂* = 0.871, so the min CV scale is t = 0.129x + 0.871y(x). Using (1.3) in this scale we
find the replacement time for 0.7 < r < 1 is 3984; for 0.594 < r < 0.7 the replacement time
is 3801; and for r < 0.594 the replacement time is 3396. These replacement times induce
the set of nested right triangles depicted in Figure 3.4.


Figure  3.4:  Nested  Policies  for  Metal  Data. 

Dashed lines represent policy boundaries, based on the min CV scale. The
policy for r < 0.594 is nested within the policy for 0.594 < r < 0.7, which is in
turn nested within the policy for 0.7 < r < 1.

D. POLICIES BASED ON IDEAL TIME SCALE


The long-run average cost C(τ) of a single-scale age replacement policy (0, τ) is
given in (1.1); τ* minimizes this expression. Using the transformation p = F(τ), with
F⁻¹(p) = sup{x: F(x) ≤ p}, equation (1.1) can be rewritten as

$$ C(p) = \frac{K + Cp}{\int_0^{F^{-1}(p)} S(u)\,du}, \quad 0 < p \le 1. \qquad (3.5) $$

Solving for p* to minimize C(p) in (3.5) and for τ* in (1.1) are identical problems; the
total time on test approach to solving the age replacement problem is based on this
transformation. Thus, τ* is the p*-quantile of the lifetime distribution F. This latter
formulation of the age replacement problem is insightful since it indicates that, under the
policy, a device has probability p* of failure before replacement. Thus, a “natural”
generalization of policy (0, τ) is a multiple-scale policy for which the probability of
failure before replacement is the same (say p) regardless of the path. With broader
applications in mind, Duchesne and Lawless (2000) introduce an “ideal” time scale (ITS)
which might be used to find such a policy.

Duchesne and Lawless (2000) motivate their definition of an ITS as follows. If a
single scale t_Z(x) = Φ(x, Z(x)) suffices for the calculation of failure probabilities, then the
distribution of T = Φ(X, Z(X)) along each Z should be independent of Z. That is,
P[T > t | Z] = P[T > t] = G(t), and G(·) does not depend upon Z. In addition, t_Z(x) must
change whenever the conditional survivor functions S_0(x, Z(x)) = P[X > x | Z] change.
Duchesne and Lawless define t_Z(x) = Φ(x, Z(x)) to be an ideal time scale if it is a one-to-one
function of S_0(x, Z(x)). In this case, P[X > x | Z] = G[t_Z(x)] = P[T > t_Z(x)]. Duchesne
(1999) explains, “an ITS is a time scale in which we can directly compare the lifetimes of
all the devices under study, no matter what their usage patterns are ... it is ‘ideal’ in the
sense that the age in the ITS is the only information needed to compute P[X > x | Z], so it
is ‘sufficient’ for computing the age of the units.”

In fact, Duchesne (1999) mentions maintenance and inspection policies as
potential applications of his ITS concept, and gives the following example. Suppose we
want to inspect devices when their probability of failure is 0.25, regardless of the path.
Suppose t = x + 5y(x) is an ITS; let T denote the lifetime of a device in scale t and t_.25
denote the 25th percentile of that lifetime distribution. If t_.25 = 100, devices should be
inspected whenever x + 5y(x) = 100. Duchesne (1999) notes that ITSs are, by definition,
unique up to one-to-one transformations. Hence, if t defines an ITS and ψ is a strictly
increasing continuous function with ψ(0) = 0 and ψ(∞) = ∞, then u = ψ(t) is also an ITS.
Thus, for example, u = t² = (x + 5y(x))² is also an ITS; let U denote lifetime in this scale.
Since Pr(U ≤ 100²) = Pr(T ≤ 100) = 0.25, we have u_.25 = 100². Thus, devices should be
inspected whenever (x + 5y(x))² = 100², which is identical to the policy based upon scale t
as defined above. This is a simple consequence of the monotone transformation.
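The invariance of a quantile-based rule under a monotone transformation is easy to verify on a sample. In this sketch the lifetimes are hypothetical, the scale t = x + 5y(x) is taken from the example above, and the quantile convention (smallest order statistic whose empirical probability reaches p) is our assumption.

```python
import math


def scale_t(x, y):
    """The scale t = x + 5y assumed to be an ITS in the example."""
    return x + 5 * y


def empirical_quantile(values, p):
    """Smallest order statistic whose empirical cdf is at least p."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]


# Hypothetical failure points (age, usage).
lifetimes = [(40, 10), (20, 30), (90, 2), (10, 12),
             (60, 25), (30, 8), (70, 4), (15, 40)]
t_life = [scale_t(x, y) for x, y in lifetimes]
u_life = [t ** 2 for t in t_life]            # monotone transformation u = t^2
t25 = empirical_quantile(t_life, 0.25)
u25 = empirical_quantile(u_life, 0.25)


def due_in_t(x, y):
    return scale_t(x, y) >= t25


def due_in_u(x, y):
    return scale_t(x, y) ** 2 >= u25
```

The quantile in scale u is the square of the quantile in scale t, so `due_in_t` and `due_in_u` flag the same devices.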

Similarly, it seems we should be able to obtain a path-independent age
replacement policy by finding the policy in any ITS and transforming this interval to a
region in the positive quadrant (as described in section A above). There is a problem,
however, stemming from the non-uniqueness of the ITS. Suppose T has an exponential
distribution. It is well known that the optimal replacement time is infinite, so the policy
in this scale would be to replace only at failure. The t-scale policy (0, ∞) translates to the
entire positive quadrant. Now, consider the policy based on scale u = t^{1/2}: U would then
have a Weibull distribution, and the policy in scale u would be (0, υ) for some υ < ∞.
Translating to the plane results in the region {(x, y(x)): (x + 5y(x))^{1/2} < υ}, which differs
from the policy based on scale t.
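This contrast can be seen numerically. The sketch below assumes the normalized cost D(τ; r) = [r + F(τ)] / ∫₀^τ S(u) du, a unit-rate exponential lifetime T, and a finite search grid; the function name, grid, and cost ratio are illustrative choices.

```python
import math


def optimal_age(survivor, r, t_max=5.0, n=5000):
    """Grid-search minimizer of (r + F(tau)) / int_0^tau S(u) du on (0, t_max]."""
    step = t_max / n
    integral, best_tau, best_val = 0.0, None, math.inf
    for i in range(1, n + 1):
        tau = i * step
        # Trapezoidal update of the integral of the survivor function.
        integral += 0.5 * step * (survivor(tau - step) + survivor(tau))
        val = (r + 1.0 - survivor(tau)) / integral
        if val < best_val:
            best_tau, best_val = tau, val
    return best_tau


r = 0.5
# T exponential: the cost keeps decreasing, so the search ends at the grid edge.
t_star = optimal_age(lambda t: math.exp(-t), r)
# U = T**0.5 has survivor function exp(-u**2), a Weibull with shape 2.
u_star = optimal_age(lambda u: math.exp(-u * u), r)
```

In scale t the search runs to the right end of the grid, which is consistent with replacing only at failure. In scale u it stops at an interior point, and translating that point back gives t = u_star², a finite replacement age.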

To illustrate this, consider the metal fatigue data discussed in Case Study 3 of
Chapter II. Duchesne and Lawless (2000) show that scale t = x + 6.7y(x) is a reasonable
approximation to the true, unknown ITS. Let T denote the lifetime in this scale; we first
“reduce” each pair (x, y(x)) to scale t. Then, upon estimating F_T(t) = P(T ≤ t) with the
empirical distribution, we find that for r = 0.5, the minimizer of (1.3) is τ̂ = 26125. The
ITS interval (0, 26125) corresponds to the region M_t = {(x, y(x)): x + 6.7y(x) < 26125}.
The boundary of this policy is the solid line in Figure 3.5. Under this policy, we replace
the device upon failure or when the sum of its accumulated low cycles and 6.7 times its
accumulated high cycles reaches 26125.

Figure 3.5: Policies Based on Ideal Scales t and u.

The  solid  line  represents  the  policy  boundary  for  r  =  0.5  based  on  scale  t 
and  the  dashed  line  represents  the  policy  boundary  for  r  =  0.5  based  on 
scale  u. 


We now construct the age replacement policy for this data using another ITS. If
t = x + 6.7y(x) is ideal for the metal data, then the monotone transformation u = t² is also
ideal. Proceeding as above, upon calculating the failure times U we find the minimizer of
equation (1.3) is υ̂ = 40760². In the plane, the ITS interval (0, 40760²) corresponds to the
region M_u = {(x, y(x)): x + 6.7y(x) < 40760}. The boundary of this region is the dashed
line in Figure 3.5. Observe M_u is not the same as M_t, the region derived from the first
ideal scale.

In summary, path-independent, fixed-probability-of-failure inspection policies can
be based on an ITS, but basing an age replacement policy on an ITS can pose significant
problems. The reason ideal scales pose problems for age replacement but not
fixed-probability inspection policies relates to our discussion of the ordering and “spacing”
action of combined scales. An ITS Φ, like other combined scales, orders and induces
spacings between the failure times. A monotone function ψ of Φ maintains the ordering
of the times given by ITS Φ, but the spacings change. This fundamentally changes the
nature of the failure distribution on which the optimal age replacement policy depends.
(An obvious exception is when ψ is linear; see Lemma A.1 in Appendix A.) More
specifically, let T and U denote the lifetimes in scales Φ and ψ(Φ), respectively; let τ*
and υ* denote optimal replacement times in these scales. The observation above is that
although U = ψ(T), it is not necessarily true that υ* = ψ(τ*). This is due to the fact that in
transforming the cost function (1.1) from scale t to scale u, the numerator remains
constant but the denominator changes.

E.  DISCUSSION  AND  SUMMARY 

In this chapter we have discussed how a multiple-scale age replacement policy
might be obtained if scales age and usage are combined in various ways. One method of
Kordonsky and Gertsbakh (1994) is motivated from the standpoint of cost. For a fixed
$r > 0$, this method finds the “best” vector $(1 - a, a)$ on which to project the data based on a
“converted” cost function; the resulting policy $M_r$ is triangular (or possibly of form $M_x$ or
$M_y$). However, we note for $s < r$ the method is not guaranteed to have $M_s \subset M_r$; this is
because the “best” scale depends on the cost ratio. For a fixed $r > 0$, policies based on
the min CV scale are triangular (or possibly $M_x$ or $M_y$) and since minimizing CV results
in a vector $(1 - a, a)$ independent of $r$, the policies for a decreasing sequence of cost ratios
are nested. Finally, we note that if, based on the failure data, a reasonable estimate of the
ITS can be found, a policy in this scale has the property of fixed probability of failure
before replacement, regardless of the path. While this property is attractive, we note
monotone transformations of the ITS are also ideal, but do not necessarily result in the
same policy as in the original ITS.

Combining  scales  is  convenient  in  that  it  allows  analysis  to  proceed  along  one 
scale.  There  is  a  drawback  to  the  combining  of  scales,  however.  Kordonsky  and 
Gertsbakh  (1995)  explain  how  damages  in  the  different  time  scales  can  interact:  in 
aviation,  corrosion  (as  reflected  by  the  time  scale  “calendar  time”)  affects  both  fatigue 
damages due to the amount of time in level flight (as reflected by the time scale “flight
hours”) and the high-amplitude stresses incurred during the takeoff and landing cycle (as
reflected  by  the  time  scale  “number  of  landings”).  As  such,  they  observe  “No  single  time 
scale  is  sufficient  for  a  complete  description  of  all  wear  and  damage  accumulation 
leading  to  failure  in  one  of  the  aircraft  parts.”  Thus,  useful  information  may  be  lost  even 
if  the  “best”  single  time  scale  is  used  (i.e.,  the  one  which  best  accounts  for  the  damage 
accumulation  processes  and  their  interaction);  for  this  reason,  we  proceed  to  the 
introduction of new methods which do not combine the scales.

IV.  POLICIES  GIVEN  DATA  ALONG  SEVERAL  LINEAR  PATHS


In this chapter we generalize the single-scale failure replacement interval $(0, \tau)$ to
the  multiple-scale  setting  in  which  failure  data  fall  along  several  linear  paths.  Such 
situations  often  arise  in  modeling  real-world  observational  lifetime  data  in  multiple  scales 
(e.g.,  Gertsbakh  and  Kordonsky,  1998  and  Lawless  et  al.,  1995).  In  many  cases  X  and  Y 
are  known  but  the  usage  curve  Z  is  unknown  and  is  approximated  by  a  straight  line. 

Linear  usage  paths  may  also  arise  by  cyclic  usage  in  fatigue  life  experiments  (as 
exemplified  in  the  metal  data).  The  development  is  as  follows.  First,  we  establish 
notation  to  be  used  throughout  the  chapter.  In  so  doing,  we  describe  the  cost  function 
used  to  define  an  “optimal”  policy  in  this  setting.  Next,  we  explain  how  to  estimate  the 
optimal  policy  for  given  failure  data,  and  present  an  example.  We  then  compare  our 
approach  to  the  methods  found  in  the  literature,  and  summarize. 

A.  “COMPOSITE”  POLICIES 

Consider a population of devices differing only in their rates of use, which
remain constant throughout their lifetimes. Thus, suppose that upon entering service, a
device is assigned a linear path $Z_i$ (characterized by its slope $\theta_i$) with probability $p_i$,
$i = 1, \ldots, m$. Suppose also that $0 < \theta_1 < \theta_2 < \cdots < \theta_m < \infty$. Let $F_i$ be the distribution of
lifetime $X$ (in chronological time) given $\theta = \theta_i$, $i = 1, \ldots, m$; as in Chapter I,
$F_i(x) = P(X < x \mid \theta = \theta_i)$. From (1.1) the long-run average cost per unit time for a device
operating with $\theta = \theta_i$ under policy $(0, \tau_i)$ is

$$C_i(\tau_i) = \frac{c_1 + c_2 F_i(\tau_i)}{\int_0^{\tau_i} S_i(u)\,du}, \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.1}$$

Let $\tau_i^*$ be an optimal age replacement time for devices on path $i$, $i = 1, \ldots, m$; that is,
$\tau_i^* = \arg\min_{\tau_i > 0} C_i(\tau_i)$. To form a composite policy from the path-specific policies
$(0, \tau_i^*)$ for $i = 1, \ldots, m$, let $M_{\boldsymbol{\tau}^*} = \{(x, \theta_i x): 0 < x < \tau_i^*,\ i = 1, \ldots, m\}$. This composite policy has
replacement times summarized by the vector $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*)$, meaning devices on
path $Z_i$ are replaced upon failure or when their age reaches $\tau_i^*$ (whichever occurs first),
$i = 1, \ldots, m$. As in Case Study 1 of Chapter II, since at any given chronological time
$x > 0$ the position of a device along its usage path is known, we can specify the
replacement times solely in terms of chronological time.

In Case Study 3 of Chapter II, for the metal data, estimation of replacement times
for such a composite policy did not result in a sensible policy. More specifically, with
$\Theta = \{0.053, 0.250, 0.667, 1.5, 4, 19\}$ and $\mathcal{X} = \{(x, \theta_i x): 0 < x,\ \theta_i \in \Theta,\ i = 1, \ldots, 6\}$, the
composite policy with replacement times (23580, 10300, 5700, 3200, 1000, 275) does not
correspond to a region which is a lower set in $\mathcal{M}_{\mathcal{X}}$. We now give conditions on a
replacement time vector $(\tau_1, \tau_2, \ldots, \tau_m)$ that ensure $M_{\boldsymbol{\tau}}$ is a lower set.

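Proposition 4.1. A composite policy $M_{\boldsymbol{\tau}} = \{(x, \theta_i x): 0 < x < \tau_i,\ i = 1, \ldots, m\}$ for
devices on linear usage paths where $0 < \theta_1 < \theta_2 < \cdots < \theta_m$ is a lower set with respect to
the matrix partial order on $\mathcal{X} = \{(x, \theta_i x): 0 < x,\ i = 1, \ldots, m\}$ if and only if both
$\tau_{i+1} \le \tau_i$ and $\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$, $i = 1, \ldots, m - 1$.

Proof: Starting with the reverse statement, let $\mathbf{x} \in M_{\boldsymbol{\tau}}$ and let $\mathbf{y} \in \mathcal{X}$ such that
$\mathbf{y} \preceq \mathbf{x}$. To show $M_{\boldsymbol{\tau}}$ is a lower set with respect to the matrix partial order on $\mathcal{X}$, it suffices
to show $\mathbf{y} \in M_{\boldsymbol{\tau}}$. Because $\mathbf{x} \in M_{\boldsymbol{\tau}}$, the age $\mathbf{x} = (t, \theta_j t)$ for some $0 < t < \tau_j$ and some $\theta_j$ in $\Theta$.
Similarly, because $\mathbf{y} \in \mathcal{X}$, $\mathbf{y} = (s, \theta_k s)$ for some $s > 0$ and some $\theta_k$ in $\Theta$. Because $\mathbf{y} \preceq \mathbf{x}$, it
follows that $s \le t$ and $\theta_k s \le \theta_j t$. It suffices to show $0 < s < \tau_k$. First, treat the case $k \le j$.
Because $s \le t$ and $\tau_j \le \tau_k$, we have $0 < s \le t < \tau_j \le \tau_k$. On the other hand, if $k > j$, then
because $\theta_k s \le \theta_j t$ and $\theta_j\tau_j \le \theta_k\tau_k$, we have $0 < s \le (\theta_j/\theta_k)t < (\theta_j/\theta_k)\tau_j \le \tau_k$. Thus, the policy
is a lower set.

Turning to the direct statement, suppose $M_{\boldsymbol{\tau}}$ is a lower set; let $i \in \{1, \ldots, m - 1\}$.
Suppose further that $\tau_{i+1} > \tau_i$. Let $x = (\tau_{i+1} + \tau_i)/2$; consider $\mathbf{u} = (x, \theta_{i+1}x) \in M_{\boldsymbol{\tau}}$ and
$\mathbf{v} = (x, \theta_i x) \in \mathcal{X}$. Note that $\mathbf{v} \preceq \mathbf{u}$, but because $x > \tau_i$, $\mathbf{v} \notin M_{\boldsymbol{\tau}}$. This contradicts the fact that
$M_{\boldsymbol{\tau}}$ is a lower set. Thus, $\tau_{i+1} \le \tau_i$. Similarly, suppose $\theta_{i+1}\tau_{i+1} < \theta_i\tau_i$. Let
$y = (\theta_{i+1}\tau_{i+1} + \theta_i\tau_i)/2$, $x = y/\theta_i$ and $z = y/\theta_{i+1}$. Consider $\mathbf{u} = (x, y) \in M_{\boldsymbol{\tau}}$ and $\mathbf{v} = (z, y) \in \mathcal{X}$. Note
that $\mathbf{v} \preceq \mathbf{u}$, but because $z > \tau_{i+1}$, $\mathbf{v} \notin M_{\boldsymbol{\tau}}$, contradicting the fact that $M_{\boldsymbol{\tau}}$ is a lower set. Thus
$\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$.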


This proposition reveals the problems encountered in Case Study 3 of Chapter II.
The policy with $\boldsymbol{\tau} = (23580, 10300, 5700, 3200, 1000, 275)$ has $\theta_5\tau_5 < \theta_4\tau_4$; in order for
$M_{\boldsymbol{\tau}}$ to be in $\mathcal{M}_{\mathcal{X}}$ we need $\theta_5\tau_5 \ge \theta_4\tau_4$ (all other requirements of the proposition are
satisfied). Similarly, the policy with $\boldsymbol{\tau} = (10000, 10300, 5700, 3200, 1200, 275)$ has
$\tau_2 > \tau_1$; for $M_{\boldsymbol{\tau}}$ to be in $\mathcal{M}_{\mathcal{X}}$ we need $\tau_2 \le \tau_1$. Thus, for the metal data, the hypothetical
policies we considered in Case Study 3 are not lower sets. Several ad hoc methods can
be used to transform these policies into members of $\mathcal{M}_{\mathcal{X}}$. For one, a linear interpolation
can be used to “smooth” sequential members of $\boldsymbol{\tau}$ which violate either of the conditions
$\tau_{i+1} \le \tau_i$ or $\theta_{i+1}\tau_{i+1} \ge \theta_i\tau_i$. Another alternative is to use a pooling scheme (as is done in
isotonic regression; see Robertson, Wright, and Dykstra, 1988) to transform the policy.
However, neither of these schemes takes into account the cost of implementing the
resulting policy. Since it is desirable to obtain a sensible policy which is optimal with
respect to some cost function, we now introduce such a cost function.
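The two conditions of Proposition 4.1 are easy to check mechanically. The following Python sketch does so for the two hypothetical Case Study 3 vectors quoted above; the function name `lower_set_violations` and the tolerance argument are our own illustrative choices, not part of the original development.

```python
def lower_set_violations(theta, tau, tol=1e-9):
    """List the conditions of Proposition 4.1 that the vector tau violates.

    theta: increasing path slopes; tau: replacement times, one per path.
    An empty list means both conditions hold for every adjacent pair.
    """
    violations = []
    for i in range(len(tau) - 1):
        # Condition 1: replacement times must be non-increasing in the slope.
        if tau[i + 1] > tau[i] + tol:
            violations.append(f"tau_{i + 2} > tau_{i + 1}")
        # Condition 2: usage at replacement must be non-decreasing in the slope.
        if theta[i + 1] * tau[i + 1] < theta[i] * tau[i] - tol:
            violations.append(f"theta_{i + 2}*tau_{i + 2} < theta_{i + 1}*tau_{i + 1}")
    return violations


theta = [0.053, 0.250, 0.667, 1.5, 4, 19]
first = [23580, 10300, 5700, 3200, 1000, 275]
second = [10000, 10300, 5700, 3200, 1200, 275]

print(lower_set_violations(theta, first))   # ['theta_5*tau_5 < theta_4*tau_4']
print(lower_set_violations(theta, second))  # ['tau_2 > tau_1']
```

Each vector fails exactly one condition, matching the discussion above.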

B.  THE  COST  OF  A  COMPOSITE  POLICY 

The first policy of Case Study 3 of Chapter II is “optimal” in the sense that it
minimizes the (estimated) long-run average cost per unit of time in use for devices on
each path $i = 1, \ldots, m$. Unfortunately, the policy is not sensible from the standpoint of
implementation. We need a means of obtaining a policy that is “optimal” in a sense
which accounts for costs along each path, but is simultaneously “sensible.” An equitable
method of calculating the cost of policy $M_{\boldsymbol{\tau}}$ with corresponding replacement time vector
$\boldsymbol{\tau} = (\tau_1, \tau_2, \ldots, \tau_m)$ is to form the average, weighted by the assigned probabilities, of the
costs of the path-specific policies: let

$$C(\boldsymbol{\tau}) = \sum_{i=1}^{m} p_i\,C_i(\tau_i), \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.2}$$

A cost function of this form is studied by Gertsbakh and Kordonsky (1997) as they
address the “optimal” time scale for maintenance in heterogeneous environments. Here
$C(\boldsymbol{\tau})$ represents the expected long-run average cost per unit of time in use of maintaining
a device under a policy corresponding to its operating conditions. The dimension of $C(\boldsymbol{\tau})$
is in units of cost per unit of (chronological) time in use. If it is more meaningful to the
decision maker, equation (4.2) can be easily transformed so it has dimension units of cost
per unit of time in use in the second scale.

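From Proposition 4.1, we note that in order for a policy $M_{\boldsymbol{\tau}}$ with replacement time
vector $\boldsymbol{\tau} = (\tau_1, \tau_2, \ldots, \tau_m)$ to be in $\mathcal{M}_{\mathcal{X}}$, $\boldsymbol{\tau}$ must lie in the set $A$, defined by

$$A = \{\boldsymbol{\tau} \in (0,\infty)^m:\ \infty > \tau_1 \ge \tau_2 \ge \cdots \ge \tau_m > 0,\ \ \theta_1\tau_1 \le \theta_2\tau_2 \le \cdots \le \theta_m\tau_m\}. \tag{4.3}$$

Thus, to find the optimal “sensible” policy for a given $r > 0$, one must minimize (4.2)
subject to the restriction that $\boldsymbol{\tau}$ is in $A$. Let $\boldsymbol{\tau}^*$ denote this minimizer.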

For a given $r > 0$, if a collection of conditional distributions $\{F_i\}$ has
$(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$, then by the optimality of each $\tau_i^*$ it follows that
$\boldsymbol{\tau}^* = (\tau_1^*, \tau_2^*, \ldots, \tau_m^*)$ regardless of the mixing probabilities. Collections of distributions
with this property often arise from models common in the literature. Lawless et al.
(1995) study failure data from automobile brake pads using a form of accelerated failure
time model in which they form a time scale $u = x^{1-\eta}\,y(x)^{\eta}$, $\eta \in [0,1]$. They assume linear
usage paths $y(x) = \theta x$, so that $u = x\theta^{\eta}$, and they fit a two-parameter Weibull distribution
to failure times in scale $u$. Although their work does not pertain directly to age
replacement theory, the resulting collection of distributions of $X \mid \theta$ has this property.
Duchesne and Lawless (2000), Gertsbakh and Kordonsky (1998), and Oakes (1995)
study linear time scales $t = x + g\,y(x)$, $g > 0$. Under a linear path assumption, time scale $t$
takes form $x(1 + g\theta)$. When a parametric distribution including a scale parameter is fit to
failure times in scale $t$, the resulting collection of distributions of $X \mid \theta$ has this property.
In certain cases, proportional hazard models can also produce collections of conditional
distributions with this property.
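To see why a linear scale produces path-specific optima in $A$, consider the following sketch. It assumes that lifetimes in scale $t$ share a single distribution $F_t$ (with survivor function $S_t$) across all paths, and it writes $a_i = 1 + g\theta_i$ and $N(\cdot)$ for the numerator of (4.1) viewed as a function of the failure probability. The symbols $F_t$, $S_t$, $a_i$, $N$, and $t^*$ are introduced here only for illustration.

```latex
% Sketch under the stated assumptions: F_i(x) = F_t(a_i x), S_i(u) = S_t(a_i u).
\begin{align*}
C_i(\tau_i)
  &= \frac{N\bigl(F_t(a_i\tau_i)\bigr)}{\int_0^{\tau_i} S_t(a_i u)\,du}
   = a_i\,\frac{N\bigl(F_t(a_i\tau_i)\bigr)}{\int_0^{a_i\tau_i} S_t(s)\,ds}
   \qquad (s = a_i u), \\
\tau_i^* &= \frac{t^*}{1 + g\theta_i},
  \qquad\text{where } t^* \text{ minimizes } \frac{N(F_t(t))}{\int_0^{t} S_t(s)\,ds}, \\
\theta_i\tau_i^* &= \frac{t^*\,\theta_i}{1 + g\theta_i}.
\end{align*}
```

Since $1/(1 + g\theta)$ is decreasing in $\theta$ and $\theta/(1 + g\theta)$ is increasing in $\theta$ for $g > 0$, the vector $(\tau_1^*, \ldots, \tau_m^*)$ satisfies both chains of inequalities in (4.3). The same argument applied to the multiplicative scale $u = x\theta^{\eta}$ gives $\tau_i^* = u^*/\theta_i^{\eta}$ and $\theta_i\tau_i^* = u^*\theta_i^{1-\eta}$, which are again ordered as (4.3) requires when $\eta \in [0,1]$.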


C.  ESTIMATING  THE  OPTIMAL  COMPOSITE  POLICY 

We now turn to estimation under constraints (4.3). Assume $\{F_i\}$ is a collection of
distributions with $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$. Following (1.3), let $\hat{S}_i$ denote the empirical
survivor function based on the ordered sample chronological lifetimes
$x_{i,(1)} \le x_{i,(2)} \le \cdots \le x_{i,(n_i)}$ from path $i$, where $n_i$ is the number of observations on path $i$, and
let

$$\hat{C}_i(\tau_i) = \frac{(c_1 + c_2) - c_2\,\hat{S}_i(\tau_i)}{\int_0^{\tau_i} \hat{S}_i(u)\,du}, \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.4}$$

Thus, $\hat{C}_i(\tau_i)$ estimates $C_i(\tau_i)$. The following is the analog of (1.3) for the multiple-path
scenario:

$$\hat{C}(\boldsymbol{\tau}) = \sum_{i=1}^{m} p_i\,\hat{C}_i(\tau_i), \qquad \tau_i > 0,\ i = 1, \ldots, m. \tag{4.5}$$

In the univariate problem, the fact that the empirical cost function (1.3) is a
piecewise decreasing function reduces the search for the minimum to a finite number of
“strategic” points. Similar principles apply to searching for a minimizer of $\hat{C}(\boldsymbol{\tau})$; let $\hat{\boldsymbol{\tau}}$
be such a minimizer, i.e., $\hat{C}(\hat{\boldsymbol{\tau}}) \le \hat{C}(\boldsymbol{\tau})$ for all $\boldsymbol{\tau}$ in $A$. We now describe $\hat{\boldsymbol{\tau}}$ and prove that
it is globally optimal.
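For a single path, the search over “strategic” points can be sketched in a few lines of Python. The sketch assumes the cost has the form $(c_1 + c_2\hat{F}(\tau))/\int_0^{\tau}\hat{S}(u)\,du$ and takes $\hat{F}(\tau)$ to be the fraction of failures strictly before $\tau$, so that on each interval between failures the cost is smallest at the right endpoint. The function names and the sample values are hypothetical.

```python
import bisect
import math


def empirical_cost(xs, tau, c1, c2):
    """Empirical long-run average cost at replacement age tau for one path.

    xs must be sorted failure times. Assumption: only failures strictly
    before tau count as failures; a unit reaching age tau is replaced.
    """
    n = len(xs)
    k = bisect.bisect_left(xs, tau)  # number of failures strictly before tau
    if k == n:
        exposure = sum(xs) / n  # integral of the survivor estimate over its support
    else:
        exposure = (sum(xs[:k]) + (n - k) * tau) / n  # integral over (0, tau)
    return (c1 + c2 * k / n) / exposure


def best_unconstrained(xs, c1, c2):
    """Evaluate the cost at every failure time and at infinity; return the best pair."""
    xs = sorted(xs)
    candidates = xs + [math.inf]  # infinity means "replace only at failure"
    return min(((t, empirical_cost(xs, t, c1, c2)) for t in candidates),
               key=lambda pair: pair[1])


print(best_unconstrained([5.0, 6.0, 7.0, 8.0], c1=1.0, c2=4.0))  # (5.0, 0.2)
```

Only the failure times (and infinity) need to be examined because the numerator is constant between failures while the denominator grows with $\tau$.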

For convenience, suppose that along each path no two failure times are equal, so
that $x_{i,(1)} < x_{i,(2)} < \cdots < x_{i,(n_i)}$; also let $x_{i,(0)} = 0$ and $x_{i,(n_i+1)} = \infty$, $i = 1, \ldots, m$. Form an
$m$-dimensional grid

$$\Gamma = \mathop{\times}_{i=1}^{m} \{x_{i,(1)}, x_{i,(2)}, \ldots, x_{i,(n_i)}\} \tag{4.6}$$

based on the observations along each path. In each $m$-dimensional hypercube of the form

$$H = \mathop{\times}_{i=1}^{m} \bigl(x_{i,(j_i)},\, x_{i,(j_i+1)}\bigr], \qquad \text{where } j_i \in \{0, \ldots, n_i\},\ i = 1, \ldots, m, \tag{4.7}$$

$\hat{C}(\boldsymbol{\tau})$ is decreasing in each argument; it follows that the minimum of $\hat{C}(\boldsymbol{\tau})$ in $H$ occurs
at the vertex $(x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$. Note this vertex dominates all other points in
$H$ with respect to the matrix partial order on $(0,\infty)^m$; that is,
$\boldsymbol{\tau} \preceq (x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$ for all $\boldsymbol{\tau} \in H$. Thus, to find the global minimum of $\hat{C}(\boldsymbol{\tau})$ in
the absence of constraints, we evaluate $\hat{C}(\boldsymbol{\tau})$ at all such non-dominated vertices and
select the one yielding the smallest cost. In the presence of constraints (4.3) defining set
$A$, it seems reasonable to limit our search for $\hat{\boldsymbol{\tau}}$ to the set of these vertices which lie in $A$,
but it can be shown that checking only such vertices will not necessarily produce the
global minimum. Such a procedure, though, will yield a point corresponding to an upper
bound for the optimal cost.

Let $\mathcal{H}$ denote the set of all hypercubes $H$ as in (4.7) for which $H \cap A \neq \emptyset$. For
some $H$ in this set, the non-dominated vertex $(x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$ lies in $A$; for
others, this vertex lies outside of $A$. In the latter case, the non-dominated point in $H \cap A$
(i.e., the point that simultaneously maximizes the value of each coordinate) yields the
smallest value of $\hat{C}(\boldsymbol{\tau})$. To find $\hat{\boldsymbol{\tau}}$, an enumeration procedure is utilized to find the
non-dominated point, say $\mathbf{u}(H)$, in $H \cap A$ for all $H$ in $\mathcal{H}$. Then, $\hat{\boldsymbol{\tau}} = \arg\min \hat{C}(\mathbf{u}(H))$ among
all $H \in \mathcal{H}$. For each $H \in \mathcal{H}$, the non-dominated point $\mathbf{u}(H)$ is constructed explicitly based
on the following results.


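Proposition 4.2. For any $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ in $(0,\infty)^m$, let
$B_{\mathbf{x}} = \{\boldsymbol{\tau} \in (0,\infty)^m: \boldsymbol{\tau} \preceq \mathbf{x}\}$. Define $\mathbf{u}(\mathbf{x})$ as follows: $\mathbf{u}(\mathbf{x}) = (u_1(\mathbf{x}), u_2(\mathbf{x}), \ldots, u_m(\mathbf{x}))$ where

$$
\begin{aligned}
u_1(\mathbf{x}) &= \min\Bigl\{x_1,\ \tfrac{\theta_2}{\theta_1}x_2,\ \tfrac{\theta_3}{\theta_1}x_3,\ \ldots,\ \tfrac{\theta_m}{\theta_1}x_m\Bigr\}, \\
u_2(\mathbf{x}) &= \min\Bigl\{x_1,\ x_2,\ \tfrac{\theta_3}{\theta_2}x_3,\ \ldots,\ \tfrac{\theta_m}{\theta_2}x_m\Bigr\}, \\
&\ \ \vdots \\
u_i(\mathbf{x}) &= \min\Bigl\{x_1,\ x_2,\ \ldots,\ x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\Bigr\}, \\
&\ \ \vdots \\
u_m(\mathbf{x}) &= \min\{x_1,\ x_2,\ \ldots,\ x_m\}.
\end{aligned}
$$

Then, (1) $\mathbf{u}(\mathbf{x}) \in A \cap B_{\mathbf{x}}$ and (2) $\mathbf{y} \preceq \mathbf{u}(\mathbf{x})$ for all $\mathbf{y} \in A \cap B_{\mathbf{x}}$.

Proof: First, $\mathbf{u}(\mathbf{x}) \in B_{\mathbf{x}}$ since $u_i(\mathbf{x}) \le x_i$, $i = 1, \ldots, m$; that is, $\mathbf{u}(\mathbf{x}) \preceq \mathbf{x}$. To show
$\mathbf{u}(\mathbf{x}) \in A$ it suffices to show $\theta_i u_i(\mathbf{x}) \le \theta_{i+1}u_{i+1}(\mathbf{x})$ and $u_i(\mathbf{x}) \ge u_{i+1}(\mathbf{x})$ for $i = 1, \ldots, m - 1$.
Let $i \in \{1, \ldots, m - 1\}$. Since $\theta_{i+1} > \theta_i$,

$$
\begin{aligned}
\theta_i u_i(\mathbf{x})
 &= \theta_i \min\Bigl\{x_1, x_2, \ldots, x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\Bigr\} \\
 &= \min\{\theta_i x_1,\ \theta_i x_2,\ \ldots,\ \theta_i x_i,\ \theta_{i+1}x_{i+1},\ \ldots,\ \theta_m x_m\} \\
 &\le \min\{\theta_{i+1}x_1,\ \theta_{i+1}x_2,\ \ldots,\ \theta_{i+1}x_i,\ \theta_{i+1}x_{i+1},\ \ldots,\ \theta_m x_m\} \\
 &= \theta_{i+1}\min\Bigl\{x_1, x_2, \ldots, x_i,\ x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_{i+1}}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_{i+1}}x_m\Bigr\} \\
 &= \theta_{i+1}u_{i+1}(\mathbf{x});
\end{aligned}
$$

also

$$
\begin{aligned}
u_i(\mathbf{x})
 &= \min\Bigl\{x_1, x_2, \ldots, x_i,\ \tfrac{\theta_{i+1}}{\theta_i}x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_i}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_i}x_m\Bigr\} \\
 &\ge \min\Bigl\{x_1, x_2, \ldots, x_i,\ x_{i+1},\ \tfrac{\theta_{i+2}}{\theta_{i+1}}x_{i+2},\ \ldots,\ \tfrac{\theta_m}{\theta_{i+1}}x_m\Bigr\} \\
 &= u_{i+1}(\mathbf{x}).
\end{aligned}
$$

Thus, $\mathbf{u}(\mathbf{x}) \in A \cap B_{\mathbf{x}}$, proving (1). To show (2), let $\mathbf{y} \in A \cap B_{\mathbf{x}}$, and let $i \in \{1, \ldots, m\}$.
Since $\mathbf{y} \in A$, $y_1 \ge y_2 \ge \cdots \ge y_m$ and $\theta_m y_m \ge \cdots \ge \theta_{i+2}y_{i+2} \ge \theta_{i+1}y_{i+1} \ge \theta_i y_i$; then
$(\theta_m/\theta_i)y_m \ge \cdots \ge (\theta_{i+2}/\theta_i)y_{i+2} \ge (\theta_{i+1}/\theta_i)y_{i+1} \ge y_i$, so by definition of $\mathbf{u}(\mathbf{y})$, $u_i(\mathbf{y}) = y_i$. It
follows that $\mathbf{u}(\mathbf{y}) = \mathbf{y}$. Since $\mathbf{y} \in B_{\mathbf{x}}$, it follows that $y_i \le x_i$, $i = 1, \ldots, m$. Since each $u_i(\mathbf{z})$ is
non-decreasing in each argument of $\mathbf{z} \in (0,\infty)^m$, we have $\mathbf{y} = \mathbf{u}(\mathbf{y}) \preceq \mathbf{u}(\mathbf{x})$, as required.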
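The map $\mathbf{u}(\mathbf{x})$ of Proposition 4.2 translates directly into code. The Python sketch below uses the hypothetical name `project_onto_A` and applies it, purely as an illustration of the formula, to the first Case Study 3 vector with the metal-data slopes as printed (rounded) in the text.

```python
def project_onto_A(x, theta):
    """Return u(x) from Proposition 4.2: the largest point of A dominated by x.

    x: candidate replacement times, one per path; theta: increasing slopes.
    """
    m = len(x)
    u = []
    for i in range(m):
        head = list(x[: i + 1])  # x_1, ..., x_i enter unscaled
        # Later coordinates are rescaled by theta_k / theta_i.
        tail = [theta[k] / theta[i] * x[k] for k in range(i + 1, m)]
        u.append(min(head + tail))
    return u


theta = [0.053, 0.250, 0.667, 1.5, 4.0, 19.0]
x = [23580, 10300, 5700, 3200, 1000, 275]
print([round(v, 2) for v in project_onto_A(x, theta)])
# [23580, 10300, 5700, 2666.67, 1000, 275]
```

Only the fourth coordinate moves: it is pulled down to $(\theta_5/\theta_4)\,x_5 = (4/1.5)\cdot 1000 \approx 2666.67$, which restores $\theta_4\tau_4 \le \theta_5\tau_5$. The result coincides with the vector reported in Table 4.1 for $r = 0.5$; this is an observation about the numbers, not a claim about how that table was computed.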


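We now use this result to find $\mathbf{u}(H)$ for $H \in \mathcal{H}$.

Proposition 4.3. Let $H$ as in (4.7) be a member of $\mathcal{H}$; let $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ denote
the vertex $(x_{1,(j_1+1)}, x_{2,(j_2+1)}, \ldots, x_{m,(j_m+1)})$ and let $\mathbf{z} = (z_1, z_2, \ldots, z_m)$ denote the vertex
$(x_{1,(j_1)}, x_{2,(j_2)}, \ldots, x_{m,(j_m)})$. Let $\mathbf{u}(H) = \mathbf{u}(\mathbf{x})$ as in Proposition 4.2. Then (1) $\mathbf{y} \preceq \mathbf{u}(H)$
for all $\mathbf{y} \in A \cap H$, and (2) $\mathbf{u}(H) \in A \cap H$.

Proof: Let $\mathbf{u} = (u_1, u_2, \ldots, u_m) = \mathbf{u}(H)$. Let $\mathbf{y} \in A \cap H$ (such a $\mathbf{y}$ exists, since
$A \cap H \neq \emptyset$). Since $H \subset B_{\mathbf{x}}$ it follows that $\mathbf{y} \in A \cap B_{\mathbf{x}}$; from Proposition 4.2 we know that
$\mathbf{y} \preceq \mathbf{u}$, thus proving (1). By (1), we have $y_i \le u_i$, $i = 1, \ldots, m$. Since $\mathbf{y} \in H$, we know that
$z_i < y_i \le x_i$, $i = 1, \ldots, m$. Because $\mathbf{u} \preceq \mathbf{x}$, we know $u_i \le x_i$, $i = 1, \ldots, m$. From these
inequalities it follows that $z_i < u_i \le x_i$, $i = 1, \ldots, m$, so that $\mathbf{u}(H) \in H$. By Proposition 4.2
we know $\mathbf{u}(H) \in A$; thus, we have shown (2).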
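With $\mathbf{u}(H)$ available in closed form, the enumeration procedure can be sketched in Python. This is a minimal illustration, not an efficient implementation: it visits every hypercube, so its cost grows like $\prod_i (n_i + 1)$. It assumes the cost form $(c_1 + c_2\hat{F})/\int\hat{S}$ with failures counted strictly before $\tau$, distinct failure times on each path, and hypothetical function names and sample data throughout.

```python
import bisect
import itertools
import math


def empirical_cost(xs, tau, c1, c2):
    """Single-path empirical cost; failures strictly before tau are counted."""
    n = len(xs)
    k = bisect.bisect_left(xs, tau)
    exposure = sum(xs) / n if k == n else (sum(xs[:k]) + (n - k) * tau) / n
    return (c1 + c2 * k / n) / exposure


def composite_cost(samples, tau, p, c1, c2):
    """Weighted average of the path-specific empirical costs, as in (4.5)."""
    return sum(pi * empirical_cost(xs, t, c1, c2)
               for pi, xs, t in zip(p, samples, tau))


def project_onto_A(x, theta):
    """u(x) of Proposition 4.2."""
    m = len(x)
    return [min(list(x[: i + 1])
                + [theta[k] / theta[i] * x[k] for k in range(i + 1, m)])
            for i in range(m)]


def constrained_minimizer(samples, theta, p, c1, c2):
    """Enumerate hypercubes H, evaluate the cost at u(H), and keep the best."""
    samples = [sorted(xs) for xs in samples]
    grids = [[0.0] + xs + [math.inf] for xs in samples]
    best_tau, best_cost = None, math.inf
    for js in itertools.product(*(range(len(g) - 1) for g in grids)):
        z = [g[j] for g, j in zip(grids, js)]      # lower vertex of H
        x = [g[j + 1] for g, j in zip(grids, js)]  # non-dominated vertex of H
        u = project_onto_A(x, theta)
        # By Proposition 4.3, u(x) lies in H whenever H meets A; skip otherwise.
        if any(ui <= zi for ui, zi in zip(u, z)):
            continue
        cost = composite_cost(samples, u, p, c1, c2)
        if cost < best_cost:
            best_tau, best_cost = u, cost
    return best_tau, best_cost


samples = [[10.0, 20.0, 30.0], [4.0, 12.0, 25.0]]  # made-up lifetimes on two paths
tau_hat, cost_hat = constrained_minimizer(samples, [0.5, 2.0], [0.5, 0.5], 1.0, 4.0)
print(tau_hat, cost_hat)
```

An infinite coordinate in the returned vector would mean "replace only at failure" on that path; it can arise only from the hypercube whose upper vertex is infinite in every coordinate.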

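We now show that our procedure returns the global minimum of $\hat{C}(\boldsymbol{\tau})$.

Theorem 4.1: $\hat{C}(\hat{\boldsymbol{\tau}}) \le \hat{C}(\boldsymbol{\tau})$ for all $\boldsymbol{\tau} \in A$.

Proof: Let $\boldsymbol{\tau} \in A$. Because the grid $\Gamma$ defines a partition of the positive orthant,
$\boldsymbol{\tau} \in H$ for some $H \in \mathcal{H}$. Form $\mathbf{u}(H)$ as described above. By definition, $\hat{C}(\hat{\boldsymbol{\tau}}) \le \hat{C}(\mathbf{u}(H))$,
so it remains to show $\hat{C}(\mathbf{u}(H)) \le \hat{C}(\boldsymbol{\tau})$. By construction of $\mathbf{u}(H)$ we have $\boldsymbol{\tau} \preceq \mathbf{u}(H)$; in $H$,
$\hat{C}(\boldsymbol{\tau})$ is decreasing in each argument, so it follows that $\hat{C}(\mathbf{u}(H)) \le \hat{C}(\boldsymbol{\tau})$, as required.

D.  EXAMPLE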

Returning to the metal fatigue data from Case Study 3 of Chapter II, Table 4.1
contains the policy vector $\hat{\boldsymbol{\tau}} = (\hat{\tau}_1, \ldots, \hat{\tau}_6)$ for $r = 0.5$ along with the values $\theta_i\hat{\tau}_i$, to
amplify the fact that $M_{\hat{\boldsymbol{\tau}}}$ is a member of $\mathcal{M}_{\mathcal{X}}$. In the policy for $r = 0.75$, $\hat{\tau}_2 = 15200$ so
that $\theta_2\hat{\tau}_2 = 3800$. All other components are identical to the policy for $r = 0.5$. The policy
for $r = 1$ is identical to the policy for $r = 0.75$. Thus, for this data the procedure produces
nested policies for these values of the cost ratio. Figure 4.1 contains a scatterplot of the
data overlaid with line segments representing paths curtailed by their corresponding
replacement times for $r = 0.5$.


  i    Slope $\theta_i$    $\hat{\tau}_i$      $\theta_i\hat{\tau}_i$
  1    0.053               23580               1241
  2    0.250               10300               2575
  3    0.667               5700                3800
  4    1.500               2666.67             4000
  5    4.000               1000                4000
  6    19.00               275                 5225

Table 4.1: Composite Policy for the Metal Data, r = 0.5.

For example, row 5 indicates that non-failed devices on a linear usage path
of slope 4 are replaced when the number of low-load cycles accrued
reaches 1000. At this time, the number of high-load cycles accrued is 4000.

Figure 4.1: Metal Data with Policy for r = 0.5.

The solid lines represent the failure replacement region for the policy with
replacement time vector (23580, 10300, 5700, 2666.67, 1000, 275).

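This example also sheds light on ways to reduce the computational burden of
finding $\hat{\boldsymbol{\tau}}$: it is often unnecessary to compute $\hat{C}$ at the non-dominated point in every
$H \in \mathcal{H}$. We recommend first finding the unrestricted minimizer $\tilde{\boldsymbol{\tau}}$. A basic optimization
principle is that if the solution of a relaxation happens to satisfy a restriction, then it
solves the restriction. This principle implies that if $\tilde{\boldsymbol{\tau}} \in A$, then $\hat{\boldsymbol{\tau}} = \tilde{\boldsymbol{\tau}}$. Thus, if the
unrestricted minimizer lies in the set $A$, no further computation is necessary. Computing
$\tilde{\boldsymbol{\tau}}$ can save computation even if $\tilde{\boldsymbol{\tau}} \notin A$. In some cases, $\tilde{\boldsymbol{\tau}}$ may violate only one constraint
defining the set $A$; restricting the coordinates causing the violation (while leaving the
others relaxed) may lead to an optimal solution. More specifically, suppose that
$\tilde{\boldsymbol{\tau}} = (\tilde{\tau}_1, \tilde{\tau}_2, \ldots, \tilde{\tau}_m)$ is such that for some $k$ in $\{1, \ldots, m - 1\}$, either $\tilde{\tau}_k < \tilde{\tau}_{k+1}$ or
$\theta_k\tilde{\tau}_k > \theta_{k+1}\tilde{\tau}_{k+1}$. Let $\hat{\tau}_k$ and $\hat{\tau}_{k+1}$ minimize

$$\hat{C}_{k,k+1}(\tau_k, \tau_{k+1}) = \hat{C}_k(\tau_k)\,p_k + \hat{C}_{k+1}(\tau_{k+1})\,p_{k+1},$$

subject to $(\tau_k, \tau_{k+1}) \in A_k$, where

$$A_k = \{(\tau_k, \tau_{k+1}) \in (0,\infty)^2:\ \tau_k \ge \tau_{k+1},\ \theta_k\tau_k \le \theta_{k+1}\tau_{k+1}\}.$$

Let $\hat{\boldsymbol{\tau}}'$ denote the vector formed by replacing $\tilde{\tau}_k$ and $\tilde{\tau}_{k+1}$ in $\tilde{\boldsymbol{\tau}}$ with $\hat{\tau}_k$ and $\hat{\tau}_{k+1}$,
respectively. It can be shown that if $\hat{\boldsymbol{\tau}}' \in A$, then $\hat{\boldsymbol{\tau}} = \hat{\boldsymbol{\tau}}'$. This approach works for the
metal data for $r = 0.5$; recall from Case Study 3 that $\tilde{\boldsymbol{\tau}}$ violates one constraint defining
set $A$. This approach applies sequentially on the metal data for $r = 0.75$; in this case $\tilde{\boldsymbol{\tau}}$
violates two constraints.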

E.  COMPARISON  WITH  SCALE-COMBINING  APPROACHES 

The scale-combining methods discussed in Chapter III differ fundamentally from
our estimation procedure in their motivation, but in some cases produce sensible policies.
The “best scale” method seeks the linear time scale $t(a) = (1 - a)x + a\,y(x)$, $a \in [0,1]$, with
corresponding $t(a)$-scale replacement time $\tau_a$, that yields the lowest long-run average cost
(per unit of chronological time, after “conversion”). The min CV method seeks the linear
scale corresponding to the smallest lifetime CV. Both of these procedures use the data to
produce a linear time scale and hence a policy of the form $M_x$, $M_y$, or a triangular set
$M_a = \{(x, y(x)): t(a) < \tau_a\}$. In contrast, the policies produced by our procedure are
required only to be lower sets. This is a broader class of policies than those resulting
from a linear scale.

Ideal time scale methods seek the scale $t$ such that $P[T > t_0 \mid Z]$ does not depend
on the path $Z$; hence, a policy based on an ITS has the property that the probability of
failure before replacement in this scale is the same, regardless of the path. Most of the
focus of Duchesne (1999) is on inference procedures for the parameters of ITS models
which are either linear (i.e., $t = x + g\,y(x)$, $g > 0$) or multiplicative (i.e., $u = x^{1-\eta}\,y(x)^{\eta}$,
$0 \le \eta \le 1$). In the case of linear paths with slopes $\theta \in \{\theta_1, \ldots, \theta_m\}$, these scales always
result in sensible policies. To demonstrate this, suppose the data are reasonably
described by a linear ITS model $t = x + g\,y(x)$. The “best” scale for age replacement and
min CV scale can be re-parameterized to this form. The policy takes the following form:
replace non-failed devices when $x + g\,y(x) = \hat{\tau}$. It follows that the replacement time
vector is $(\hat{\tau}/(1 + g\theta_1), \ldots, \hat{\tau}/(1 + g\theta_m)) \in A$. Restricting attention to preventive
maintenance policies formed by ITS models may be appropriate in some cases; however,
we have noted in Chapter III that the non-uniqueness of an ITS can cause problems for
estimation of age replacement policies even when the ITS has a simple parametric form.
Unfortunately, given a set of lifetime data (along linear paths or otherwise) it is rarely
clear which (if any) parametric form the ITS should take. Duchesne (1999) suggests a
non-parametric procedure for estimating the true, unknown ITS; this procedure links the
quantiles along the paths. Policies based on the resulting scale can be constructed which
are not lower sets.

In Chapter II we have noted that in the single-scale problem, policies
corresponding  to  a  sequence  of  decreasing  cost  ratios  are  “nested.”  We  have  also 
observed  that  this  quality  is  desirable  for  multiple-scale  policies  because  non-nested, 
multiple-scale  policies  prescribe  replacement  times  for  devices  on  some  paths  that  are 
inconsistent  with  respect  to  the  corresponding  cost  ratios.  We  have  also  observed  in 
Chapter  III  that  policies  based  on  either  the  min  CV  scale  or  on  an  ITS  are  nested,  but 
policies  based  on  the  “best”  scale  for  age  replacement  method  are  not  guaranteed  to  be 
nested.  Due  to  the  nature  of  the  single-scale  cost  function  (1.3)  and  in  turn  (4.5),  the 
policies  produced  by  our  procedure  are  not  necessarily  nested.  However,  we  show  in 
Chapter  V  that  in  practice,  our  procedure  tends  to  produce  nested  policies  even  with 
small  samples.  In  such  cases,  our  procedure  forms  a  time  scale  based  on  the  cost  ratio  r. 
The  points  along  each  path  corresponding  to  the  replacement  time  for  a  given  r  have  the 
same  age  in  this  scale.  Also,  in  a  manner  analogous  to  the  cost  sensitivity  analyses 
conducted  with  the  aid  of  TTT  plots,  we  find  there  are  ranges  of  r  over  which  the  same 
composite  policy  is  valid.  Combined  scales,  on  the  other  hand,  essentially  order  the 
observations  based  on  their  lifetimes  in  the  combined  scales;  points  along  contours  of 
these  scales  are  the  same  “age”  in  these  scales,  indicating  they  have,  in  a  sense, 
accumulated  the  same  level  of  damage. 

F.  DISCUSSION  AND  SUMMARY 

In this chapter we developed a method of estimating the optimal “sensible” policy
given lifetimes from a population of devices which age along linear paths. Under such a
policy, non-failed devices on path $Z_i$ are replaced when their chronological age reaches $\tau_i$,
$i = 1, \ldots, m$. As such, this composite policy technically applies only to devices on these
paths. Policies based on combined scales of the form considered in Chapter III have this
same form when applied to data on linear paths. The assumption that devices age exactly
along linear paths is usually an approximation of reality; thus, it is worthwhile to consider
ways to extend these policies to ones that apply to devices on any path. The policy $(0, \tau)$
in a combined scale $t$ extends in a natural way to the region $\{(x, y(x)): t < \tau\}$ in the
positive quadrant, as exemplified in Figure 3.4 and Figure 3.5.

The key consideration for extending the policy produced by our estimation
procedure is to ensure that the resulting policy is a lower set with respect to the matrix
partial order on $(0,\infty)^2$. Consider, for example, a population of devices aging along lines
of slope $\theta_1 = 0.5$, $\theta_2 = 2$, or $\theta_3 = 8$. Suppose that for some $r > 0$ the replacement times are
$\tau_1 = 20$, $\tau_2 = 10$, and $\tau_3 = 5$, respectively. The solid line segments in Figure 4.2 represent
the failure replacement region for this policy. To extend this policy to the positive
quadrant, we need a non-increasing function on $(0,\infty)$ that is contained within the
rectangular regions delimited by the dashed lines in Figure 4.2. This function induces a
boundary of the failure replacement region; non-failed devices are replaced when their
usage curve crosses this boundary. A “conservative” extension is to choose a step
function coincident with the lower boundaries of the boxes; a more “aggressive”
extension is to choose a step function coincident with the upper boundaries of the boxes
(in this case there is no usage limit for devices with $x < 5$). Between the two extremes,
we arbitrarily choose a smooth curve through the policy points $\{(20,10), (10,20), (5,40)\}$,
as depicted in Figure 4.2. We address the problem of determining the cost of
implementing such policies in Chapter VI.


Figure  4.2:  Extension  of  Estimated  Optimal  Policy. 

The  solid  lines  represent  the  failure  replacement  region  for  the  policy  with 
replacement time vector (20, 10, 5). The dashed lines represent bounds for
a  non-increasing  function  serving  as  a  policy  boundary  under  the  lower  set 
restriction.  The  smooth  curve  represents  the  boundary  of  one  possible 
extension  of  the  policy  based  on  the  linear  paths  of  slope  0.5,  2,  and  8. 


Additionally, we note that our focus in this chapter has been on completely
non-parametric estimation of the optimal policy. We acknowledge it is also possible to
estimate the $F_i$ under the restriction that the estimates be IFR. Ingram and Scheaffer
(1976), however, find little value added from the increased computational burden over
empirical estimation. We remark that if parametric (or other nonparametric) distributions
are fit to each $F_i$ and a $\hat{\tau}_i$ estimating $\tau_i^*$ is found for a given $r > 0$, the vector
$(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$ is not necessarily in $A$. It is possible, however, to estimate parameters of
certain collections $\{F_i\}$ under the restriction that $(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$ be in $A$. Gertsbakh and
Kordonsky (1998) consider an example of such a collection. They discuss estimation in
the Weibull family under which the shape parameter is constant for all paths but the scale
parameters are allowed to vary. Geurts (1983) acknowledges optimal age replacement
times in the Weibull family are relatively insensitive to the shape parameter, so in our
setting this seems to be a reasonable approach. In such a case, it can be shown that if the
scale parameters satisfy conditions akin to (4.3), the resulting composite policy
$(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$ is in $A$. General conditions under which $(\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_m)$ is in $A$ need further
study.

Finally, in this chapter we focus on linear paths in two scales. The concepts developed here can be generalized to more than two scales. For example, $m$ linear paths in $k+1$ scales can be represented by $(x, y_1(x), \ldots, y_k(x))$ where $y_j(x) = \theta_{ij}x$, $i = 1, \ldots, m$, $j = 1, \ldots, k$. For $m$ such paths, as in two scales, an age replacement policy need only specify replacement ages $(\tau_1, \tau_2, \ldots, \tau_m)$ in chronological time. In addition, the cost function (4.2) remains the same. The difference comes in specifying constraints (4.3) so that the policy $(\tau_1, \tau_2, \ldots, \tau_m)$ is indeed sensible in the original scales. We speculate that the constraints and the estimator will have forms similar to those developed in this chapter. We do note, however, that it is difficult to imagine practical applications of extending these results to more than three scales.
V. PROPERTIES OF THE ESTIMATED OPTIMAL COMPOSITE POLICY

In this chapter we address the properties of the policy $\hat{\tau}$. We begin with a discussion of its large-sample properties, and then investigate its small-sample behavior through simulation. We conclude with simulation results aimed at comparing the performance of the policies produced by our procedure with those based on the min CV method.

A.  LARGE-SAMPLE  PROPERTIES 

Let $\hat{S}_i$ be a uniformly strongly consistent estimator of $S_i$, $i = 1, \ldots, m$. For example, if lifetimes along path $i$ are from a simple random sample, then taking $\hat{S}_i$ to be the empirical survivor function (1.2) gives a nonparametric estimator of $S_i$, which by the Glivenko-Cantelli lemma converges uniformly to $S_i$ with probability 1. On the other hand, should lifetimes along path $i$ be right-censored, depending on the censoring mechanism, the Kaplan-Meier estimator is an appropriate choice for $\hat{S}_i$. With such an estimator and the assumption that $\tau_i^* < \infty$ exists and is unique (e.g., if $F_i$ is IFR with failure rate strictly increasing to $\infty$), it is well known (e.g., Arunkumar, 1972) that $\tilde{\tau}_i$ minimizing $\hat{C}_i(\tau_i)$ is a strongly consistent estimator of $\tau_i^*$. From this it follows, for the composite policy with replacement time vector $\tilde{\tau} = (\tilde{\tau}_1, \ldots, \tilde{\tau}_m)$, that

$$\max_{1 \le i \le m} |\tilde{\tau}_i - \tau_i^*| \to 0$$

with probability 1 as $n_i \to \infty$, for all $i = 1, \ldots, m$. This result, of course, does not require the estimated policy $\tilde{\tau}$ to be in $A$ even if $(\tau_1^*, \ldots, \tau_m^*)$ is in $A$.
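
The single-path estimator described above can be illustrated with a short sketch. It assumes complete (uncensored) lifetimes, the empirical survivor function, and the convention that the empirical distribution function counts lifetimes strictly less than $t$. Under these assumptions the empirical cost decreases between consecutive order statistics, so only the observed lifetimes need to be checked. The function name `empirical_optimal_age` and the sample data are hypothetical and are not taken from the source.

```python
def empirical_optimal_age(lifetimes, K, C):
    """Minimize the empirical age replacement cost over the observed lifetimes.

    Cost at age t is (K + C * Fhat(t)) / integral_0^t Shat(u) du, where
    Fhat(t) is the fraction of lifetimes strictly less than t and the
    integral equals the sample mean of min(x_i, t).
    """
    n = len(lifetimes)
    best_age, best_cost = None, float("inf")
    for t in sorted(lifetimes):
        f_hat = sum(x < t for x in lifetimes) / n
        mean_time_in_use = sum(min(x, t) for x in lifetimes) / n
        cost = (K + C * f_hat) / mean_time_in_use
        if cost < best_cost:  # keep the earliest minimizer on ties
            best_age, best_cost = t, cost
    return best_age, best_cost


# Hypothetical data: three lifetimes on one path, with K = C = 1.
age, cost = empirical_optimal_age([1.0, 2.0, 3.0], K=1.0, C=1.0)
```

Applying the function to each path separately gives the unrestricted vector of path-wise minimizers; it does not enforce membership in $A$.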


Showing the strong consistency of the estimator $\hat{\tau}$, which is required to be in $A$, takes a bit more care. With the individual $\tau_i^* < \infty$ and unique, and $\hat{S}_i$ a uniformly strongly consistent estimator of $S_i$, a small modification of Ingram and Scheaffer's (1976) argument shows that $\hat{C}_i$ converges uniformly to $C_i$ in an interval bounded away from zero with probability 1. In particular, Ingram and Scheaffer (1976) show that $\int_0^\infty S_i(u)\,du < \infty$ by appealing to the condition that $F_i$ be IFR. However, this is also true if $\tau_i^* < \infty$ and unique, because for $t > \tau_i^*$, $0 < C_i(\tau_i^*) < C_i(t)$ and hence $\int_0^\infty S_i(u)\,du = (K + C)\lim_{t \to \infty}(1/C_i(t)) < \infty$. For the multiple-scale functions $C(\tau)$ and $\hat{C}(\tau)$, we have

$$\hat{C}(\tau) - C(\tau) = \sum_{i=1}^{m} p_i \left[\hat{C}_i(\tau_i) - C_i(\tau_i)\right].$$

Thus, for $a > 0$ we see that $\hat{C}(\tau)$ converges uniformly to $C(\tau)$ in the $m$-dimensional region $[a,\infty)^m$. Suppose $\tau \notin [a,\infty)^m$, so that $\tau_j < a$ for some $j = 1, \ldots, m$. Then $C(\tau)$, and similarly $\hat{C}(\tau)$, are bounded below as follows:

$$C(\tau) \ge p_j C_j(\tau_j) = p_j\,\frac{(K + C) - C\,S_j(\tau_j)}{\int_0^{\tau_j} S_j(u)\,du} > \frac{K}{a}\,\min_{1 \le i \le m} p_i.$$
An application of the multivariate analog of Theorem 1 of Arunkumar (1972, p. 252) then gives strong consistency of $\hat{\tau}$ as an estimate of $\tau^*$, as stated in the following theorem.

Theorem 5.1. Let $r > 0$, let $(\tau_1^*, \ldots, \tau_m^*) \in A$ be unique, where $A$ is defined in (4.3), let $\tau_i^* < \infty$, and let $\hat{S}_i$ be a uniformly strongly consistent estimator of $S_i$, $i = 1, \ldots, m$. Then

$$\max_{1 \le i \le m} |\hat{\tau}_i - \tau_i^*| \to 0$$

with probability 1 as each $n_i \to \infty$, $i = 1, \ldots, m$.

We note that the proof of Theorem 5.1 does not require $\hat{\tau}$ to be unique. Indeed, with $\hat{S}_i$ as the empirical survivor function, uniqueness of $\hat{\tau}$ is not guaranteed. In addition, although $(\tau_1^*, \ldots, \tau_m^*) \in A$ for most practical cases, this is not a strict requirement. What is required in the proof of Theorem 5.1 is the existence of a unique $\tau^*$ minimizing $C(\tau)$ among $\tau \in A$ and that $\tau^*$ has finite elements. Weak convergence of $\hat{\tau}$ is not studied here. Arunkumar (1972) does develop the asymptotic distribution of the minimizer of (1.3) in the one-dimensional case. Perhaps Arunkumar's approach can be used to establish weak convergence for the multi-dimensional, restricted estimator $\hat{\tau}$.

Furthermore, for large samples, the estimators of the optimal policies are nested. Suppose $s < r$, and let $\tau_i^*(s)$ and $\tau_i^*(r)$ minimize $C_i(\tau_i)$ with respective cost ratios $s$ and $r$, $i = 1, \ldots, m$. If $(\tau_1^*(s), \ldots, \tau_m^*(s)) \prec (\tau_1^*(r), \ldots, \tau_m^*(r))$, the corresponding failure replacement regions are nested. Suppose both $(\tau_1^*(s), \ldots, \tau_m^*(s))$ and $(\tau_1^*(r), \ldots, \tau_m^*(r))$ are in $A$ and $\tau_i^*(s) < \tau_i^*(r)$, $i = 1, \ldots, m$. Then it follows from Theorem 5.1 that with probability 1, for all $n_1, n_2, \ldots, n_m$ large enough, the estimated policies satisfy $\hat{\tau}(s) \prec \hat{\tau}(r)$, and thus their corresponding failure replacement regions will be nested.

B.  SMALL-SAMPLE  BEHAVIOR 
1.  General  Simulation  Results 

We use simulation to gain insight into the behavior of the estimated cost function and policy for small sample sizes. In this simulation, devices have “low,” “medium,” or “high” rates of use, corresponding to usage paths of slope $\theta_1 = 1$, $\theta_2 = 2$, or $\theta_3 = 5$. For each path, lifetimes arise from the Weibull distribution, with density

$$f(t; \beta, \varphi) = \frac{\beta}{\varphi}\left(\frac{t}{\varphi}\right)^{\beta - 1} \exp\left\{-\left(\frac{t}{\varphi}\right)^{\beta}\right\}, \quad t > 0. \tag{5.2}$$

As in the simulations of Ingram and Scheaffer (1976), we fix the shape parameter $\beta = 2$ for each path. Gertsbakh and Kordonsky (1998) also assume the Weibull shape parameter is constant over paths. The scale parameter $\varphi$ is varied for the three paths so that $\varphi_1 = 40/21$, $\varphi_2 = 10/7$, and $\varphi_3 = 1$ for paths 1, 2, and 3, respectively. These scale parameters ensure $(\tau_1^*, \tau_2^*, \tau_3^*)$ lies in $A$ for any $r > 0$.
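
As a check on this setup, the following sketch computes the single-path optimal replacement ages for Weibull lifetimes with shape 2. It rests on two assumptions that are not spelled out in this section: the cost ratio is $r = K/C$, and the optimal age solves the standard first-order condition $h(t)\int_0^t S(u)\,du - F(t) = K/C$ for an increasing failure rate $h$. It also assumes the composite cost is the $p$-weighted sum of the single-path costs. The function names are hypothetical.

```python
import math


def weibull2_optimal_age(scale, r):
    """Optimal replacement age for Weibull(shape=2, scale) lifetimes, r = K/C.

    Solves h(t) * integral_0^t S(u) du - F(t) = r by bisection; the left side
    is increasing in t because h(t) = 2 t / scale**2 is increasing.
    """
    def g(t):
        x = t / scale
        integral_s = scale * math.sqrt(math.pi) / 2.0 * math.erf(x)
        return (2.0 * t / scale**2) * integral_s - (1.0 - math.exp(-x * x))

    lo, hi = 0.0, 10.0 * scale
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weibull2_cost_over_C(t, scale, r):
    """Single-path long-run cost divided by C: (r + F(t)) / integral_0^t S."""
    x = t / scale
    integral_s = scale * math.sqrt(math.pi) / 2.0 * math.erf(x)
    return (r + 1.0 - math.exp(-x * x)) / integral_s


scales = [40 / 21, 10 / 7, 1.0]          # scale parameters for paths 1, 2, 3
tau_star = [weibull2_optimal_age(s, 1.0) for s in scales]
p_uniform = [1 / 3, 1 / 3, 1 / 3]        # mixing probabilities of Group 1
composite_cost = sum(
    p * weibull2_cost_over_C(t, s, 1.0)
    for p, t, s in zip(p_uniform, tau_star, scales)
)
```

Under these assumptions, `tau_star` agrees to within rounding with the values of $\tau^*$ reported for run 1 in Table 5.2 (2.078, 1.558, 1.091), and `composite_cost` agrees with the scaled cost 1.618 reported for run 1 in Table 5.3.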

Four groups of simulations are performed to investigate the small-sample behavior of $\hat{C}(\tau)$ and $\hat{\tau}$ as sample sizes along paths $n = (n_1, n_2, n_3)$, mixing probabilities $p = (p_1, p_2, p_3)$, and cost ratio $r$ vary. Each group corresponds to realistic settings for $n$ and $p$. There are three runs within each group, to investigate the effects of varying $r$. Table 5.1 depicts the settings used in each run.


Group   Run   n            p                  r
1       1     (5,5,5)      (1/3, 1/3, 1/3)    1.0
1       2     (5,5,5)      (1/3, 1/3, 1/3)    0.5
1       3     (5,5,5)      (1/3, 1/3, 1/3)    0.1
2       4     (5,5,5)      (0.1, 0.8, 0.1)    1.0
2       5     (5,5,5)      (0.1, 0.8, 0.1)    0.5
2       6     (5,5,5)      (0.1, 0.8, 0.1)    0.1
3       7     (10,10,10)   (1/3, 1/3, 1/3)    1.0
3       8     (10,10,10)   (1/3, 1/3, 1/3)    0.5
3       9     (10,10,10)   (1/3, 1/3, 1/3)    0.1
4       10    (10,10,10)   (0.1, 0.8, 0.1)    1.0
4       11    (10,10,10)   (0.1, 0.8, 0.1)    0.5
4       12    (10,10,10)   (0.1, 0.8, 0.1)    0.1

Table 5.1: Settings for General Simulation Runs


Sample  sizes  of  5  and  10  are  common,  particularly  in  observational  data  or  experiments 
designed  to  study  the  lifetime  of  high-cost  prototypic  devices.  Mixing  probabilities 
(1/3,  1/3,  1/3)  represent  populations  for  which  devices  are  evenly  spread  across  several 
usage  rates;  and  mixing  probabilities  (0.1, 0.8, 0.1)  represent  populations  for  which  a 
large  majority  of  the  devices  have  a  “medium”  rate  of  use  (e.g.,  automobiles).  Table  5.1 
contains  runs  for  which  the  relative  frequencies  of  the  sample  sizes  along  paths  differ 
from  the  mixing  probability  vector  since  it  is  not  uncommon  for  the  mixture  of  test  assets 
to  differ  from  the  mixture  in  the  actual  population.  Finally,  the  cost  ratios  1, 0.5,  and  0.1 
are  common  in  the  literature. 

Each run of the simulation consists of 200 replications. In replication $j$, we generate a data set consisting of $n_i$ Weibull$(2, \varphi_i)$ lifetimes, $i = 1, 2, 3$, and for this data set we find $\hat{\tau}^{(j)}$ corresponding to the given $p$ and $r$ using the procedure described in Chapter IV. The random number seed is set in advance for replicability. For each run, we compute several quantities to gain insight into the small-sample performance of $\hat{\tau}$ as an estimator of $\tau^*$. Table 5.2 contains $\tau^*$, the minimizer of the true cost function $C(\tau)$, found numerically. It also lists $\mathrm{Av}(\hat{\tau}) = (1/200)\sum_{j=1}^{200} \hat{\tau}^{(j)}$, an estimate of the expected value of $\hat{\tau}$, and the difference $\mathrm{Av}(\hat{\tau}) - \tau^*$, an estimate of the bias of $\hat{\tau}$. Finally, it includes $p(\hat{\tau})$, the proportion of the replications for which $\hat{\tau} = (\tilde{\tau}_1, \tilde{\tau}_2, \tilde{\tau}_3)$; this quantity reveals how often $\tilde{\tau} \in A$ and hence we find $\hat{\tau}$ “automatically,” with minimal computation.


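Run   $\tau^*$ (paths 1, 2, 3)   Av($\hat{\tau}$) (paths 1, 2, 3)   Av($\hat{\tau}$) - $\tau^*$ (paths 1, 2, 3)   $p(\hat{\tau})$
1     2.078   1.558   1.091      2.005   --      --                 --      --      --                         --
2     1.406   1.054   0.738      1.471   1.048   0.713              --      --      --                         --
3     0.607   0.456   0.319      0.866   --      0.407              0.258   0.151   0.088                      0.210
4     2.078   1.558   1.091      2.180   --      0.973              --      --      --                         0.225
5     1.406   1.054   0.738      1.607   --      0.706              --      --      --                         0.260
6     0.607   0.456   0.319      --      0.634   0.424              0.316   0.179   0.105                      0.210
7     2.078   1.558   1.091      2.121   1.545   1.035              0.044   -0.013  -0.056                     0.250
8     1.406   1.054   0.738      1.466   --      --                 0.061   0.010   0.009                      0.330
9     0.607   0.456   0.319      --      --      --                 0.146   0.082   0.038                      0.225
10    2.078   1.558   1.091      --      --      --                 --      --      --                         0.250
11    1.406   1.054   0.738      1.514   --      --                 0.108   --      --                         --
12    0.607   0.456   0.319      0.768   --      --                 --      --      --                         --

Entries shown as -- are illegible in the source. Three legible values could not be assigned to a column: 1.601 and 0.223 (run 10) and 0.095 (run 12).

Table 5.2: Small-Sample Performance of $\hat{\tau}$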


First, by comparing rows 1-6 with rows 7-12 in Table 5.2, we note that increasing sample sizes generally results in an increase in the (estimated) accuracy of $\hat{\tau}$. As expected, increasing sample sizes increases the proportion of replications for which $\hat{\tau} = (\tilde{\tau}_1, \tilde{\tau}_2, \tilde{\tau}_3)$. To investigate the effect of a non-uniform $p$ on $\hat{\tau}$ in small-sample situations, compare rows 1-3 with rows 4-6 and rows 7-9 with rows 10-12. In general, the accuracy of $\hat{\tau}$ decreases slightly, but this effect is reduced as the sample sizes increase. By examining columns 4-6 of the rows within each group, we note the “average” policies are nested.

We proceed as follows to determine if the policies produced in each individual replication of a given run are nested. By setting the random seed, we generate the same lifetimes for each run in the first two groups and in the last two groups. Hence, for example, the estimated policies for replication $j$ of runs 1, 2, and 3 are based on the same random numbers. For a fixed group, let $\hat{\tau}^{(j)}(r)$ denote the estimated policy for cost ratio $r$ given the data for replication $j$. It can be shown that these policies are nested if $\hat{\tau}^{(j)}(0.1) \prec \hat{\tau}^{(j)}(0.5) \prec \hat{\tau}^{(j)}(1)$. For each of the four groups, we find that nesting occurs in each of the 200 replications.
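
The nesting check can be sketched as follows, under the assumption that the order $\prec$ between replacement time vectors is the componentwise comparison used earlier in this chapter. The helper names are hypothetical; the example vectors are the true optimal policies for $r = 0.1$, $0.5$, and $1$ listed in rows 3, 2, and 1 of Table 5.2.

```python
def precedes(a, b):
    """Componentwise order on replacement time vectors (assumed meaning of the order)."""
    return all(x <= y for x, y in zip(a, b))


def is_nested(policies):
    """True if each policy in the sequence precedes the next one."""
    return all(precedes(a, b) for a, b in zip(policies, policies[1:]))


# Policies ordered by increasing cost ratio r = 0.1, 0.5, 1.
policies_by_r = [
    (0.607, 0.456, 0.319),
    (1.406, 1.054, 0.738),
    (2.078, 1.558, 1.091),
]
nested = is_nested(policies_by_r)
```

The same function applies unchanged to the longer sequence of cost ratios used in the nesting simulation.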

For each run, we also compute several quantities to gain insight into the small-sample performance of $\hat{C}(\tau)$ as an estimator of the true cost $C(\tau)$. First, we compute $C(\tau^*)$, the exact cost of the true optimal policy, from (4.2). Next, we compute $\mathrm{Av}[\hat{C}(\hat{\tau})] = (1/200)\sum_{j=1}^{200} \hat{C}^{(j)}(\hat{\tau}^{(j)})$, an estimate of the expected estimated minimum cost of age replacement, and then the sample standard deviation of the $\hat{C}(\hat{\tau})$. We also compute $\mathrm{Av}[C(\hat{\tau})] = (1/200)\sum_{j=1}^{200} C(\hat{\tau}^{(j)})$, an estimate of the expected true cost at the estimated optimal policy. Finally, we compute $b[\hat{C}(\hat{\tau})] = \mathrm{Av}[\hat{C}(\hat{\tau})] - \mathrm{Av}[C(\hat{\tau})]$ and $\mathrm{MSE}[\hat{C}(\hat{\tau})] = (1/200)\sum_{j=1}^{200}\left(\hat{C}^{(j)}(\hat{\tau}^{(j)}) - C(\hat{\tau}^{(j)})\right)^2$, estimates of the bias and MSE of $\hat{C}(\hat{\tau})$ as an estimator of $C(\hat{\tau})$, respectively. These quantities are scaled by the factor $1/C$ and displayed in Table 5.3.
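
The summary quantities defined above are simple averages over replications. A minimal sketch, using hypothetical per-replication values rather than any results from the study, is:

```python
def summarize_cost_estimates(estimated_costs, true_costs):
    """Average estimated cost, average true cost, bias, and MSE over replications.

    estimated_costs[j] is the estimated minimum cost in replication j;
    true_costs[j] is the true cost of the policy estimated in replication j.
    """
    n = len(estimated_costs)
    av_estimated = sum(estimated_costs) / n
    av_true = sum(true_costs) / n
    bias = av_estimated - av_true
    mse = sum((e - c) ** 2 for e, c in zip(estimated_costs, true_costs)) / n
    return av_estimated, av_true, bias, mse


# Hypothetical values for two replications (illustration only).
summary = summarize_cost_estimates([1.0, 2.0], [1.5, 2.5])
```

A negative bias indicates that the estimated minimum cost understates the true cost of the estimated policy.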


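Run   $C(\tau^*)$   Av[$\hat{C}(\hat{\tau})$]   SD[$\hat{C}(\hat{\tau})$]   Av[$C(\hat{\tau})$]   $b[\hat{C}(\hat{\tau})]$   MSE[$\hat{C}(\hat{\tau})$]
1     1.618         --                          0.272                       1.679                 -0.199                     0.107
2     --            --                          --                          1.150                 -0.246                     0.094
3     --            --                          --                          0.518                 -0.247                     0.074
4     1.554         --                          0.351                       --                    --                         0.142
5     1.052         --                          0.249                       --                    --                         --
6     0.454         --                          0.145                       --                    --                         --
7     1.618         --                          --                          --                    --                         0.052
8     --            --                          --                          --                    -0.173                     0.049
9     --            --                          --                          --                    -0.185                     0.042
10    1.554         1.465                       0.242                       --                    --                         --
11    1.052         --                          0.178                       --                    --                         --
12    0.454         0.300                       0.115                       0.490                 -0.191                     0.047

Entries shown as -- are illegible in the source.

Table 5.3: Small-Sample Performance of $\hat{C}(\hat{\tau})$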


As in Table 5.2, by comparing rows 1-6 with rows 7-12 of Table 5.3, we note that increasing the sample sizes results in an increase in the (estimated) accuracy and precision of $\hat{C}(\tau)$ as an estimator of $C(\tau)$. To investigate the effect of a non-uniform $p$ on $\hat{C}(\tau)$, compare rows 1-3 with rows 4-6 and rows 7-9 with rows 10-12. In general, the accuracy and precision of $\hat{C}(\tau)$ decrease slightly, but this effect is reduced as the sample sizes grow.
2. Results of Nesting Simulation

We also use simulation to investigate in more detail the nesting tendency of the policies produced by our procedure. In the general simulation, we used the sequence of cost ratios {1, 0.5, 0.1}; in this simulation we use a more refined sequence {1, 0.9, ..., 0.1}. We retain the same slopes and Weibull parameters as in the general simulation. The nesting simulation consists of 4 runs of 20 replications each; for each replication we use a new random number seed. To investigate the effect of sub-sample size and mixing probability on nesting, we vary $n$ and $p$ between runs. The settings for $n$ and $p$ for the four runs coincide with the settings in Groups 1-4 in Table 5.1 (i.e., run 1 has the same settings as in Group 1, and so on). In each replication of a given run, we generate $n_i$ Weibull$(2, \varphi_i)$ lifetimes, $i = 1, 2, 3$; for this data set we find $\hat{\tau}^{(j)}(r)$ for each $r$ in {1, 0.9, ..., 0.1} and we check whether $\hat{\tau}^{(j)}(0.1) \prec \hat{\tau}^{(j)}(0.2) \prec \cdots \prec \hat{\tau}^{(j)}(1)$. For each run, we find that nesting occurs in each of the 20 replications.

C.  COMPARISON  WITH  MIN  CV  METHOD 

We further use simulation to gain insight into the performance of composite policies estimated using our procedure in comparison with composite policies estimated using the min CV procedure. Here, we compare the true costs of the policies produced by the two procedures using the sample sizes, mixing probabilities, and cost ratios contained in Table 5.1. As in the general simulation, we use devices with usage paths of slope $\theta_1 = 1$, $\theta_2 = 2$, or $\theta_3 = 5$ and assume that $X \mid \theta_i \sim$ Weibull$(2, \varphi_i)$, $i = 1, 2, 3$. But in this simulation, the scale parameters $\varphi_i$ correspond with distributions for which the min CV method is expected to return reasonable estimates of $(\tau_1^*, \tau_2^*, \tau_3^*)$.

Unlike our procedure, the min CV method is not designed specifically for the purpose of estimating $(\tau_1^*, \tau_2^*, \tau_3^*)$. Nonetheless, for certain families of conditional distributions, the policy based on the min CV method does in fact estimate $(\tau_1^*, \tau_2^*, \tau_3^*)$. Consider, for example, a population of devices on linear usage paths whose lifetimes correspond to the model

$$X + \gamma Y \sim \text{Weibull}(\beta, \varphi).$$

That is, devices have lifetimes corresponding to the linear ITS model with time scale parameter $\gamma$. The times in the ITS have a Weibull distribution with shape parameter $\beta$ and scale parameter $\varphi$ (see Duchesne and Lawless, 2000). It can be shown that along paths we have $X \mid \theta \sim$ Weibull$(\beta, \varphi/(1 + \gamma\theta))$. Suppose $\beta = 2$, $\varphi = 4$, and $\gamma = 3/5$. It follows that $X \mid \theta_i \sim$ Weibull$(2, \varphi_i)$ where $\varphi_1 = 2.5$, $\varphi_2 = 20/11$, and $\varphi_3 = 1$; these scale parameters are used throughout the study. These scale parameters ensure $(\tau_1^*, \tau_2^*, \tau_3^*)$ lies in $A$ for any $r > 0$.
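
The conditional scale parameters follow from the relation $\varphi_i = \varphi/(1 + \gamma\theta_i)$ stated above. A small sketch with exact rational arithmetic reproduces the arithmetic; the function name `conditional_scale` is hypothetical.

```python
from fractions import Fraction


def conditional_scale(phi, gamma, theta):
    """Scale of X given slope theta when X + gamma * Y is Weibull with scale phi."""
    return phi / (1 + gamma * theta)


phi, gamma = Fraction(4), Fraction(3, 5)
scales = [conditional_scale(phi, gamma, theta) for theta in (1, 2, 5)]
# scales holds the exact values 5/2, 20/11, and 1 for slopes 1, 2, and 5.
```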

For a given $r > 0$, our procedure always returns a policy with lower estimated cost than any other policy in (4.3). But since the true $\tau^*$ in this simulation corresponds to a triangular policy, and min CV restricts attention to such policies, we would expect the policy based on the min CV scale to have lower actual cost than our estimated policy.

We find, though, that our procedure compares favorably in terms of true costs also. The 12 runs of this simulation use the $n$, $p$, and $r$ as described in Table 5.1; each run of the simulation consists of 200 replications. In a given replication, we generate $n_i$ lifetimes from Weibull$(2, \varphi_i)$, $i = 1, 2, 3$. From this data set we compute a, resulting in $\hat{\tau}_{CV}$, the policy produced by the min CV method. We also compute $\hat{\tau}$ using our method. Hence, the results of each run are pairs $(\hat{\tau}_{CV}^{(j)}, \hat{\tau}^{(j)})$, $j = 1, \ldots, 200$. For each run, we compute $C(\tau)$ at each of these values and (due to occasional non-normality) perform a Wilcoxon signed-rank test on the differences $C(\hat{\tau}_{CV}^{(j)}) - C(\hat{\tau}^{(j)})$, $j = 1, \ldots, 200$. For every run we reject the null hypothesis that the true mean difference is non-positive; approximate p-values are 0 in each case. In fact, our estimator results in a lower-cost policy in 67% to 85% of the 200 replications for each run.
VI. POLICIES GIVEN DATA FROM UNKNOWN USAGE PATHS


Assume that $(X, Y)$ has support $\mathcal{X} = (0, \infty)^2$ and that usage paths are unknown. Unlike the setting with known linear usage paths, there is no natural way to write the cost function in terms of one-dimensional cost functions and still be able to compute the cost for any policy $M$ in $\mathcal{M}_{\mathcal{X}}$. Approaches that use combined scales reduce the cost function to a one-dimensional cost function in the combined scale, but they do so by restricting policies to classes of nested policies. Combined scale approaches do not lend themselves to comparison of policies that are not nested. In this chapter, we develop a cost function that is a natural generalization of the one-dimensional cost function (1.1) and can be applied to all policies in $\mathcal{M}_{\mathcal{X}}$.

In the single-scale problem, the cost function (1.1) has the interpretation “long-run average cost per unit of time in use,” and arises in a relatively natural way from univariate renewal theory. Under a joint model for $(X, Y)$, it seems reasonable to consider a cost function of the same nature as (1.1), with interpretation “long-run average cost per unit of time in use,” where “time in use” can be measured in chronological time or usage (e.g., flight hours or landings). In practice, budgets are often made with respect to chronological time, rather than usage. With this in mind, the cost function we develop has dimension cost per unit of use in chronological time. It does, however, incorporate both scales and could easily be taken to be cost per unit of usage.

As in previous chapters, we consider policies $M$ in $\mathcal{M}_{\mathcal{X}}$ under which a device is replaced upon failure or when its usage path crosses the boundary of $M$, whichever occurs first. We develop the two-dimensional renewal reward process as the foundation on which we base the cost function for policies in $\mathcal{M}_{\mathcal{X}}$. For a given set of failure times, we then demonstrate how to estimate an optimal rectangular policy in $\mathcal{M}_{\mathcal{X}}$, and conclude with an example.

A.  THE  TWO-DIMENSIONAL  RENEWAL  REWARD  PROCESS 

The cost function that we develop arises from considering renewal reward processes (see Appendix A) in two dimensions. Let $R(u,v)$ denote the rectangle $[0,u] \times [0,v]$ for $u \ge 0$, $v \ge 0$. A stochastic process $\{N(u,v);\ u \ge 0, v \ge 0\}$ is said to be a two-dimensional counting process if $N(u,v)$ represents the total number of events that have occurred in $R(u,v)$. Let $\{(U_i, V_i)\}$ be a sequence of independent and identically distributed (iid) non-negative random vectors, and let $S_n^{(1)} = \sum_{i=1}^{n} U_i$ and $S_n^{(2)} = \sum_{i=1}^{n} V_i$. Define $N(u,v) = \max\{n: S_n^{(1)} \le u,\ S_n^{(2)} \le v\}$. Then $\{N(u,v);\ u \ge 0, v \ge 0\}$ is also a two-dimensional renewal process (e.g., Hunter 1974a). Both $\{U_i\}$ and $\{V_i\}$ define univariate renewal processes. With $N_u^{(1)} = \max\{n: S_n^{(1)} \le u\}$ and $N_v^{(2)} = \max\{n: S_n^{(2)} \le v\}$, it is readily seen that $N(u,v) = \min\{N_u^{(1)}, N_v^{(2)}\}$. Let $R_n$ denote the reward earned at the $n$th renewal. Assume the $R_n$, $n \ge 1$, are iid; note $R_n$ may depend on $(U_n, V_n)$. Let $Z(u,v) = \sum_{n=1}^{N(u,v)} R_n$ represent the total reward earned in $R(u,v)$. Then $\{Z(u,v);\ u \ge 0, v \ge 0\}$ is a two-dimensional renewal reward process.

Now, we generalize the univariate Renewal Reward Theorem. Let $\mu_1 = E[U_1] < \infty$ and $\mu_2 = E[V_1] < \infty$; suppose also $E[R_1] < \infty$. Given a one-dimensional renewal process $\{N(t);\ t \ge 0\}$ with mean inter-renewal time $\mu$, it is well known that the total number of renewals $N(\infty)$ is infinite (e.g., Ross, 1997). For a two-dimensional renewal process, let $N(\infty,\infty)$ be the number of renewals in a square of infinite size; that is, $N(\infty,\infty) = \lim_{t \to \infty} N(t,t)$. We show that $N(\infty,\infty)$ cannot be finite.

Lemma 6.1: $N(\infty,\infty) = \infty$ with probability 1.

Proof: This proof is a generalization of Ross's (1997, p. 353) proof for the one-dimensional case.

$$P\{N(\infty,\infty) < \infty\} = P\{U_n = \infty \text{ or } V_n = \infty \text{ for some } n\} = P\left\{\bigcup_{n=1}^{\infty}\{U_n = \infty \text{ or } V_n = \infty\}\right\} \le \sum_{n=1}^{\infty} P\{U_n = \infty \text{ or } V_n = \infty\} = 0.$$

The result follows by complementation.


Given a renewal process $\{N(t);\ t \ge 0\}$ with mean inter-renewal time $\mu$, it is also well known that $\lim_{t \to \infty} N(t)/t = 1/\mu$, with probability 1 (e.g., Ross, 1997). That is, the rate at which $N(t)$ goes to infinity is the reciprocal of the mean inter-renewal time, with probability 1. The following result considers the rate at which a two-dimensional renewal process goes to infinity.

Lemma 6.2: $\lim_{t \to \infty} N(t,t)/t = 1/\max\{\mu_1, \mu_2\}$, with probability 1.

Proof: For any fixed $t$, $N(t,t) = \min\{N_t^{(1)}, N_t^{(2)}\}$. Also, for fixed $t$, $\min\{N_t^{(1)}, N_t^{(2)}\}/t = \min\{N_t^{(1)}/t,\ N_t^{(2)}/t\}$. Since $\lim_{t \to \infty} N_t^{(1)}/t = 1/\mu_1$ and $\lim_{t \to \infty} N_t^{(2)}/t = 1/\mu_2$ with probability 1, it follows that $\lim_{t \to \infty} \min\{N_t^{(1)}/t,\ N_t^{(2)}/t\} = \min\{1/\mu_1, 1/\mu_2\} = 1/\max\{\mu_1, \mu_2\}$ with probability 1.

Next, we generalize the Renewal Reward Theorem.

Theorem 6.1: $\lim_{t \to \infty} Z(t,t)/t = E[R_1]/\max\{\mu_1, \mu_2\}$ with probability 1.

Proof: Decompose $Z(t,t)/t$ as the product of $\sum_{n=1}^{N(t,t)} R_n / N(t,t)$ and $N(t,t)/t$. By Lemma 6.1 and the Strong Law of Large Numbers the first term goes to $E[R_1]$ with probability 1. By Lemma 6.2 the second term goes to $1/\max\{\mu_1, \mu_2\}$ with probability 1.
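
The limits in Lemma 6.2 and Theorem 6.1 can be checked numerically with a small simulation. The sketch below is illustrative only: it assumes independent exponential inter-renewal components with means 1 and 2, and a hypothetical reward of 1 plus 2 whenever the first component is below 1. None of these choices come from the source.

```python
import math
import random


def simulate_rates(t, mean_u, mean_v, seed=0):
    """Simulate one two-dimensional renewal reward process on the square [0,t] x [0,t].

    Returns (N(t,t)/t, Z(t,t)/t), where a renewal is counted only while both
    partial sums stay at or below t.
    """
    rng = random.Random(seed)
    s1 = s2 = 0.0
    count = 0
    total_reward = 0.0
    while True:
        u = rng.expovariate(1.0 / mean_u)
        v = rng.expovariate(1.0 / mean_v)
        if s1 + u > t or s2 + v > t:
            break  # the next renewal would fall outside the square
        s1 += u
        s2 += v
        count += 1
        total_reward += 1.0 + 2.0 * (u < 1.0)  # hypothetical reward depending on U_n
    return count / t, total_reward / t


count_rate, reward_rate = simulate_rates(t=20000.0, mean_u=1.0, mean_v=2.0)
expected_reward = 1.0 + 2.0 * (1.0 - math.exp(-1.0))  # E[R_1] for this reward
limit_count_rate = 1.0 / max(1.0, 2.0)
limit_reward_rate = expected_reward / max(1.0, 2.0)
```

For large `t` the simulated rates should be close to `limit_count_rate` and `limit_reward_rate`; the slower scale (the one with the larger mean) governs both limits.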

B.  DEVELOPMENT  OF  COST  FUNCTION  FOR  TWO-SCALE  POLICIES 

We must modify the above results slightly before they can be applied to the setting in which the components of the two-dimensional inter-renewal times $\{(U_i, V_i)\}$ are measured in different scales. In the case of two parallel time scales, the time units of the mean inter-renewal times in the denominator are not directly comparable. However, if we “convert” time in the usage scale (e.g., landings) to chronological time, we obtain a meaningful denominator. To this end, we prove a corollary to the theorem.


Corollary 6.1: For $a > 0$, $b > 0$, $\lim_{t \to \infty} Z(at, bt)/t = E[R_1]/\max\{\mu_1/a,\ \mu_2/b\}$ with probability 1.

Proof: From $\{(U_i, V_i)\}$ form the new renewal process $\{(W_i, Z_i)\}$ where $W_i = U_i/a$ and $Z_i = V_i/b$. Let $T_n^{(1)} = \sum_{i=1}^{n} W_i = S_n^{(1)}/a$ and $T_n^{(2)} = \sum_{i=1}^{n} Z_i = S_n^{(2)}/b$. Let $\tilde{N}(t,t) = \max\{n: T_n^{(1)} \le t,\ T_n^{(2)} \le t\}$. As $E[W_1] = \mu_1/a$ and $E[Z_1] = \mu_2/b$, we have $\lim_{t \to \infty} \tilde{N}(t,t)/t = 1/\max\{\mu_1/a,\ \mu_2/b\}$ with probability 1 from Lemma 6.2. But

$$\tilde{N}(t,t) = \max\{n: S_n^{(1)} \le at,\ S_n^{(2)} \le bt\} = N(at, bt).$$

This line of reasoning is essentially identical to Hunter's derivation of the limiting growth rate of $E[N(at, bt)]$ (1974b, pp. 555-6). The result follows immediately, using this fact and the decomposition technique from the proof of Theorem 6.1.

Now we are positioned to use the results and discussion above to develop the function with which we can compute the cost for a given member $M$ of the set $\mathcal{M}_{\mathcal{X}}$. Consider the one-dimensional case in which a device has lifetime $X$ and operates under the age replacement policy $(0, \tau)$. Recall the interpretation of the objective function (1.1): the long-run average cost per unit of “time in use” of implementing policy $(0, \tau)$. Here, the “time in use” corresponding to lifetime $X$ is simply the replacement time $\min\{X, \tau\}$.

Now, consider the two-dimensional case in which a device has lifetime $(X, Y)$ and operates under policy $M \in \mathcal{M}_{\mathcal{X}}$. We seek an objective function with a similar interpretation, but now “time in use” is more problematic. Let $(U, V)$ denote the replacement time under policy $M$. We consider two cases. First, suppose $(X, Y) \in M$. This means that the device failed before crossing the boundary of $M$, so clearly $(U, V) = (X, Y)$. Thus, its “time in use” is $U = X$ and $V = Y$, and its two-dimensional replacement time is simply $(X, Y)$. Second, suppose $(X, Y) \notin M$. We know the device begins its life at $(0, 0)$. As it ages, it traces out a usage path terminating at $(X, Y)$, which, by assumption, lies outside of $M$. At some point, its usage path crossed the boundary of $M$. Had policy $M$ been in place, its “time in use” in both scales would be the point at which the usage path crossed the boundary of $M$. But by assumption we only know $(X, Y)$ and $M$, not its usage path. Since usage paths are often approximated by a straight line, we adopt the following convention: let $(U, V)$ be the point of intersection of the boundary of $M$ and the chord connecting $(0, 0)$ to $(X, Y)$. We describe $(U, V)$ in either case as follows:

$$U = \sup\{x \le X : (x, (Y/X)x) \in M\}, \quad \text{and} \quad V = (Y/X)\,U. \tag{6.2}$$
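
Definition (6.2) can be evaluated numerically for any policy given as a membership test. The sketch below assumes $M$ is a lower set, so that membership along the chord from the origin switches from true to false at most once and bisection locates the crossing. The function name `replacement_point` and the two example policies are hypothetical.

```python
def replacement_point(x, y, in_policy, iterations=60):
    """Return (U, V) from (6.2) for failure point (x, y) and policy test in_policy.

    If (x, y) lies in M the device fails first and (U, V) = (x, y); otherwise
    bisection finds where the chord from (0, 0) to (x, y) leaves M.
    """
    if in_policy(x, y):
        return x, y
    slope = y / x
    lo, hi = 0.0, x  # lo stays inside M, hi stays outside M
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if in_policy(mid, slope * mid):
            lo = mid
        else:
            hi = mid
    return lo, slope * lo


def in_rectangle(x, y):
    return x < 2.0 and y < 3.0        # hypothetical lower rectangle (0,2) x (0,3)


def in_triangle(x, y):
    return x + y < 3.0                # hypothetical triangular lower set


u_rect, v_rect = replacement_point(4.0, 2.0, in_rectangle)
u_tri, v_tri = replacement_point(3.0, 3.0, in_triangle)
```

For the rectangle, the chord of slope 1/2 exits through the vertical edge, so the replacement point is approximately (2, 1); for the triangle, the chord of slope 1 exits at approximately (1.5, 1.5).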

We now construct the two-dimensional renewal reward process for a device operating under policy $M \in \mathcal{M}_{\mathcal{X}}$. We are given two-dimensional failure times $(X_1, Y_1), (X_2, Y_2), \ldots$ iid from some bivariate lifetime distribution $F$; thus $\{(U_i, V_i)\}$ are iid. Let $R(u,v)$ denote $[0,u] \times [0,v]$. Let $N(u,v)$ represent the total number of replacements made in $R(u,v)$. Since the $\{(U_i, V_i)\}$ are iid, $\{N(u,v);\ u \ge 0, v \ge 0\}$ is a two-dimensional renewal process. As in the one-dimensional case, let the “reward” (i.e., cost for replacement) $R$ be $K$ if replaced due to age and $(K + C)$ if replaced due to failure. Let $Z(u,v)$ represent the total cost incurred in $R(u,v)$. Then $\{Z(u,v);\ u \ge 0, v \ge 0\}$ is a two-dimensional renewal reward process, with inter-renewal times $\{(U_i, V_i)\}$, rewards $R_i = K + C\,I[(X_i, Y_i) \in M]$, and $Z(u,v) = \sum_{i=1}^{N(u,v)} R_i$. Recall the cost of policy $(0, \tau)$ in the one-dimensional case is $C(\tau) = \lim_{t \to \infty} Z(t)/t = E[R_1]/E[U_1]$, as discussed in Appendix A. To obtain a similar limiting result for the situation we have just described, we apply Corollary 6.1. Thus, let $a = 1$ and $b = E[Y]/E[X]$; let $\mu_1(M) = E[U]$ and $\mu_2(M) = E[V]$. From Corollary 6.1,

$$C(M) = \lim_{x \to \infty} Z(x, bx)/x = E[R_1]/\max\{\mu_1(M),\ \mu_2(M)/b\}, \tag{6.3}$$

with dimension cost per unit of chronological time. The coefficient $b$ in (6.3) is motivated by the “conversion factor” used by Kordonsky and Gertsbakh (1994), and can be interpreted as follows. From a reliability standpoint, one unit of usage is worth $E[X]/E[Y]$ units of chronological time, on average.

To “solve” the multiple-scale age replacement problem in this setting, we must find the $M^*$ in $\mathcal{M}_{\mathcal{X}}$ which minimizes this expression. We now demonstrate how to solve the appropriate optimization problem for a specific subset of $\mathcal{M}_{\mathcal{X}}$.

C.  FINDING  THE  BEST  RECTANGULAR  POLICY 

The aim of this section is to search over the set $\mathcal{M}_R = \{R(s,t): s > 0, t > 0\}$, the set of all “lower rectangular” policies $(0,s) \times (0,t)$. Observe $\mathcal{M}_R \subset \mathcal{M}_{\mathcal{X}}$. The set of lower rectangles is attractive since rectangular policies are easily implemented: a device is replaced upon failure or when its elapsed chronological time or cumulative usage reaches some “limit.” Hence, rectangular policies are closely akin to automobile warranties. In this section, we derive the form of the cost function for a given rectangle and describe the minimizer of the cost function formed when $F$ is estimated by the empirical distribution on the bivariate data. For the same reasons as in the univariate problem, it is convenient to define $F(x,y) = P(X < x, Y < y)$ for $(x,y)$ in $\mathcal{X}$. We now calculate the numerator and denominator in (6.3) for $C(s,t)$, the cost when $M = R(s,t)$.

We find the numerator of C(s,t) in a manner similar to the single-scale case.
Define reward R by

\[ R = \begin{cases} K + C & \text{if } (X,Y) \in (0,s) \times (0,t), \\ K & \text{if } (X,Y) \notin (0,s) \times (0,t). \end{cases} \tag{6.4} \]

Thus, the numerator is E[R] = (K + C) F(s,t) + K (1 - F(s,t)) = K + C F(s,t).

To compute the denominator, let $\mu_1(s,t) = E[U]$ and $\mu_2(s,t) = E[V]$, where U and V
are defined as in (6.2). For a fixed (s,t) in $\mathcal{X}$, let $A_1(s,t) = (0,s) \times (0,t)$,

\[ A_2(s,t) = \{(x,y) \in \mathcal{X} : y \ge t \text{ and } y \ge (t/s)x\}, \quad \text{and} \quad A_3(s,t) = \{(x,y) \in \mathcal{X} : x \ge s \text{ and } y < (t/s)x\}. \]

In what follows the parameters (s,t) are omitted from these sets to simplify notation.
Observe that these regions form a partition of $\mathcal{X}$. From (6.2), we find


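\[ U = \begin{cases} X, & \text{if } (X,Y) \in A_1 \\ tX/Y, & \text{if } (X,Y) \in A_2 \\ s, & \text{if } (X,Y) \in A_3. \end{cases} \tag{6.5} \]

Thus,

\[ \mu_1(s,t) = \iint_{A_1} x \, dF(x,y) + \iint_{A_2} (tx/y) \, dF(x,y) + \iint_{A_3} s \, dF(x,y). \tag{6.6} \]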


Similarly,

\[ V = \begin{cases} Y, & \text{if } (X,Y) \in A_1 \\ t, & \text{if } (X,Y) \in A_2 \\ sY/X, & \text{if } (X,Y) \in A_3, \end{cases} \tag{6.7} \]

and it follows that

\[ \mu_2(s,t) = \iint_{A_1} y \, dF(x,y) + \iint_{A_2} t \, dF(x,y) + \iint_{A_3} (sy/x) \, dF(x,y). \tag{6.8} \]


Assembling the parts, we find that the cost when M = R(s,t) is

\[ C(s,t) = \frac{K + C F(s,t)}{\max\{\mu_1(s,t),\ \mu_2(s,t)/b\}}. \tag{6.9} \]


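When F is estimated by a discrete bivariate distribution with mass $p_i$ on $\{(x_i, y_i),\ i = 1, \ldots, n\}$,
such as the empirical distribution, C(s,t) is estimated as follows. Let $I_j(i)$
denote the indicator function on set $A_j$ for i = 1, ..., n and j in {1, 2, 3}. That is,

\[ I_j(i) = \begin{cases} 1, & \text{if } (x_i, y_i) \in A_j \\ 0, & \text{otherwise.} \end{cases} \tag{6.10} \]

Then, it can be shown that the quantities E[R], $\mu_j(s,t)$ and b are estimated by

\[ \hat{E}[R] = K + C \sum_{i=1}^{n} I_1(i) \, p_i, \tag{6.11} \]

\[ \hat{\mu}_1(s,t) = \sum_{i=1}^{n} \left[ x_i I_1(i) + (t x_i / y_i) I_2(i) + s I_3(i) \right] p_i, \tag{6.12} \]

\[ \hat{\mu}_2(s,t) = \sum_{i=1}^{n} \left[ y_i I_1(i) + t I_2(i) + (s y_i / x_i) I_3(i) \right] p_i, \text{ and} \tag{6.13} \]

\[ \hat{b} = \frac{\sum_{i=1}^{n} y_i \, p_i}{\sum_{i=1}^{n} x_i \, p_i}. \tag{6.14} \]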


We substitute these into (6.9), obtaining

\[ \hat{C}(s,t) = \frac{\hat{E}[R]}{\max\{\hat{\mu}_1(s,t),\ \hat{\mu}_2(s,t)/\hat{b}\}}. \tag{6.15} \]


Let us now explain how to find the minimizing value of (6.15). Recall that to
solve the one-dimensional problem it suffices to evaluate $\hat{C}(\tau)$ in (1.3) at each of the
observations. We apply a similar strategy to find the minimizer of $\hat{C}(s,t)$. Because (1.3)
and (6.15) are developed in a similar manner, it is tempting to think that to find the
minimizer of $\hat{C}(s,t)$, it suffices simply to evaluate (6.15) at $(x_i, y_i)$, i = 1, ..., n, and select
the two-dimensional failure time with the smallest cost. Upon closer examination, we
find that it is necessary to evaluate (6.15) at other points in addition to the
two-dimensional failure times. Let $\hat{z}$ be such a minimizer, i.e., $\hat{C}(\hat{z}) \le \hat{C}(s,t)$ for all (s,t) in
$\mathcal{X}$. We now describe how to find $\hat{z}$.

For convenience, suppose that no chronological failure times share the same
value, so that the chronological failure times can be strictly ordered $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$,
and similarly suppose the usage failure times can be ordered $y_{(1)} < y_{(2)} < \cdots < y_{(n)}$. Let
$x_{(0)} = 0 = y_{(0)}$ and $x_{(n+1)} = \infty = y_{(n+1)}$. Form a grid

\[ \Gamma = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\} \times \{y_{(1)}, y_{(2)}, \ldots, y_{(n)}\}. \tag{6.16} \]

Note $\Gamma$ defines a partition of $\mathcal{X} = (0,\infty)^2$ into rectangles of the form
$(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, $i, j \in \{0, \ldots, n\}$. Let $n(s,t) = \hat{E}[R]$ and
$d(s,t) = \max\{\hat{\mu}_1(s,t),\ \hat{\mu}_2(s,t)/\hat{b}\}$ from (6.11), (6.12), (6.13) and (6.14).
Consider the numerator. Note that n(s,t) is constant on every
$(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, continuous from the left in s for all t, continuous from the left in t
for all s, and non-decreasing in both s and t with jumps that can only occur on the north
and east boundaries of the $(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$. Consider the denominator. We have
$\hat{\mu}_1(s,t) = \sum_{i=1}^{n} q_i(s,t) \, p_i$, where $q_i(s,t) = x_i I_1(i) + (t x_i / y_i) I_2(i) + s I_3(i)$. It can be shown
that $q_i(s,t)$, and hence $\hat{\mu}_1(s,t)$, is continuous and non-decreasing in both s and t.
Likewise, $\hat{\mu}_2(s,t) = \sum_{i=1}^{n} r_i(s,t) \, p_i$, where $r_i(s,t) = y_i I_1(i) + t I_2(i) + (s y_i / x_i) I_3(i)$. It can
also be shown that $r_i(s,t)$, and hence $\hat{\mu}_2(s,t)/\hat{b}$, is continuous and non-decreasing in both
s and t. Thus d(s,t) is continuous and non-decreasing in s and t.

On each $(x_{(i)}, x_{(i+1)}] \times (y_{(j)}, y_{(j+1)}]$, the ratio n(s,t)/d(s,t) is thus continuous and
non-increasing in s and t and therefore has minimum value at $(x_{(i+1)}, y_{(j+1)})$. By a careful
examination of the cost function it can be shown that $\hat{C}(x_i, y_{(n)}) \le \hat{C}(x_i, y_{(n)} + y)$,
i = 1, ..., n, for y > 0 and $\hat{C}(x_{(n)}, y_j) \le \hat{C}(x_{(n)} + x, y_j)$, j = 1, ..., n, for x > 0. As such, it is not
necessary to search beyond the outermost point of the grid, namely $(x_{(n)}, y_{(n)})$. These
points are gathered into the following result.

Theorem 6.2: Consider the probability distribution which places mass $p_i$ on
$(x_i, y_i)$, i = 1, ..., n. Let $\hat{C}(s,t)$ be defined as in (6.15) and $\Gamma$ as in (6.16), and
$\hat{z} = \arg\min \hat{C}(s,t)$. Then, $\hat{z} \in \Gamma$.
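
The search implied by Theorem 6.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the results in this chapter: the function names estimated_cost and best_rectangle are hypothetical, the estimate of b is taken to be the plug-in ratio of the sample means, the failure times are assumed to be strictly positive, and the default masses are those of the empirical distribution.

```python
from itertools import product


def estimated_cost(s, t, data, K, C, probs=None):
    """Estimated cost (6.15) of the rectangular policy R(s, t).

    data  : list of (x, y) failure times, all strictly positive
    probs : mass on each observation (defaults to 1/n, the empirical distribution)
    """
    n = len(data)
    probs = probs if probs is not None else [1.0 / n] * n
    reward = K            # accumulates (6.11)
    mu1 = mu2 = 0.0       # accumulate (6.12) and (6.13)
    mean_x = mean_y = 0.0
    for (x, y), p in zip(data, probs):
        mean_x += x * p
        mean_y += y * p
        if x < s and y < t:                 # A1: failure inside the rectangle
            reward += C * p
            mu1 += x * p
            mu2 += y * p
        elif y >= t and y * s >= t * x:     # A2: path leaves through the top edge
            mu1 += (t * x / y) * p
            mu2 += t * p
        else:                               # A3: path leaves through the right edge
            mu1 += s * p
            mu2 += (s * y / x) * p
    b = mean_y / mean_x                     # assumed plug-in estimate of b
    return reward / max(mu1, mu2 / b)


def best_rectangle(data, K, C, probs=None):
    """Evaluate (6.15) at every point of the grid (6.16) and return the minimizer."""
    xs = sorted({x for x, _ in data})
    ys = sorted({y for _, y in data})
    return min(product(xs, ys),
               key=lambda z: estimated_cost(z[0], z[1], data, K, C, probs))
```

For n observations the grid has at most n^2 points and each evaluation is linear in n, so the exhaustive search costs on the order of n^3 operations, which is negligible for data sets of the size considered here.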


D.  EXAMPLE 

Returning to the jet engine and automobile data sets, Table 6.1 contains $\hat{z}$ for the
cost ratios r = 1.0, 0.5, and 0.1 when F is estimated by the empirical distribution $\hat{F}$.
Beneath $\hat{z}$ in each cell is $\hat{F}(\hat{z})$.

               r = 1            r = 0.5          r = 0.1

Jet Engine     (4932,2426)      (3227,1550)      (3227,1550)
               0.238            0.000            0.000

Automobile     (330,10300)      (368,8000)       (68,8400)
               0.578            0.421            0.053

Table 6.1: Rectangular Policies for Various Cost Ratios.
Parenthetical entries in the cells represent the optimal policy
corresponding to a particular value of the cost ratio r. Beneath each such
entry is the value of the empirical distribution at this point.


We make the following observations from Table 6.1. First, as indicated by the values
$\hat{F}(\hat{z})$, more conservative policies are selected as r decreases (under more conservative
policies, devices have a smaller chance of failure before replacement). However, the
policies are not always nested; in particular, for the automobile data, the policy for r = 0.1
is not contained in the policy for r = 0.5. Also, none of the $\hat{z}$ coincides with an
observation, which underscores the need to evaluate the estimated cost function at all points
in the grid $\Gamma$. Figure 6.1 depicts the policies for the jet engine data. Note from Table 6.1
that the policy for r = 0.5 is identical to the policy for r = 0.1, and that this policy is
nested within the policy for r = 1.

Figure 6.1: Rectangular Policies for Jet Engine Data.
The dashed lines represent the boundaries of the policies for r = 0.5 and 1.


E.  DISCUSSION  AND  SUMMARY 

In  this  chapter  we  developed  the  two-dimensional  renewal  reward  process,  and  it 
served  as  the  foundation  on  which  to  build  the  cost  function  for  policies  in  Mx under  a 
joint  model  for  (X,Y).  The  cost  function  arises  from  the  analog  of  the  univariate  Renewal 
Reward  Theorem,  and  has  dimension  cost  per  unit  of  chronological  time  in  use,  much 
like  (1.1).  In  the  latter  half  of  this  chapter,  we  derived  the  form  of  the  cost  function  for 
rectangular policies and showed how to find the rectangular policy with lowest cost given
a set of bivariate failure data. The notions developed in this chapter are easily extended
to policies based on more than two scales.


We do not claim the policy $\hat{z}$ produced by this procedure is an estimate of a true
optimal z* for the underlying F. Unlike the case of several linear paths, we have yet to
find examples of non-trivial bivariate distributions for which an optimal z* or an
equivalence class of such policies exists. The closest work in the literature is that of
Murthy et al. (1995), in which the parameters of the optimal rectangular warranty policy
are found for certain named bivariate distributions, but the cost functions used to define
“optimal” are very different in nature from ours. Perhaps certain bivariate notions of
aging (e.g., bivariate IFR, etc.) can be used to identify distributions for which a z* exists.
Also, under additional conditions, it may be possible to show that $\hat{z}$ converges to z*. If
such distributions can be identified, simulation studies can be conducted to verify the
small-sample properties of $\hat{z}$.

VII. CONCLUSIONS


In  this  dissertation,  we  generalize  the  classical  age  replacement  policy  to  the  case 
in  which  the  age  of  a  device  is  recorded  in  more  than  one  time  scale.  We  use  several  case 
studies  to  motivate  the  form  of  a  general  replacement  policy  in  multiple  scales.  The  case 
studies  demonstrate  the  need  for  careful  consideration  in  developing  such  policies.  In  the 
first  two,  we  notice  that  in  some  situations,  simply  ignoring  the  usage  scale  may  not  be 
problematic,  but  in  others,  failure  times  in  one  single  scale  (e.g.,  chronological  time)  may 
not  capture  the  entire  damage  accumulation  process.  The  third  case  study  reveals  that  a 
naive  (though  seemingly  sensible)  approach  for  data  lying  along  linear  paths  can  result  in 
a  policy  that,  although  “optimal”  from  the  standpoint  of  (estimated)  costs,  is  not  sensible 
from  the  standpoint  of  implementation.  Based  on  these  observations,  we  describe  a  class 
of  policies  that  are  sensible  from  the  standpoint  of  implementation.  This  class 
generalizes  multiple-scale  policies  found  in  the  literature.  Furthermore,  we  find  it  is 
desirable  for  multiple-scale  policies  to  be  nested  when  considering  (in  sensitivity 
analyses,  for  example)  a  decreasing  sequence  of  cost  ratios;  otherwise,  the  replacement 
times  prescribed  by  the  policies  can  be  inconsistent  with  the  interpretation  of  the  cost 
ratios. 

When  failure  times  are  recorded  in  multiple  scales,  it  becomes  readily  apparent 
that  identical  devices  do  not  operate  under  identical  field  conditions.  Researchers  are 
grappling  with  ways  to  use  such  lifetime  data  to  produce  comprehensive  models,  and 
some  are  seeking  to  use  these  models  in  the  arena  of  optimal  preventive  maintenance. 



Methods  for  developing  preventive  maintenance  policies  for  such  devices  fall  on  a 
continuum ranging between two extremes. One extreme, as noted by Kordonsky and
Gertsbakh (1997), is to provide an individualized policy for every single device in the
population. They note such an approach is totally impractical and, as a result,
unacceptable.  The  other  extreme  is  the  “one-size-fits-all”  approach,  in  which  the 
“optimal”  policy  is  based  on  fitting  a  single  distribution  to  observations  which,  in  reality, 
may  come  from  a  mixture;  this  policy  is  then  applied  to  the  entire  population.  Basing  a 
policy  on  a  combined  scale  falls  in  between  these  extremes  in  that  data  in  two  scales  are 
modeled  by  a  univariate  distribution  in  some  “optimal”  scale.  As  expressed  by 
Kordonsky  and  Gertsbakh  (1997),  the  goal  of  such  approaches  is  to  find  a  scale  in  which 
maintenance  actions  can  be  described  “in  a  unified  way  which  would  fit  all  exemplars 
and  would  cover  all  operational  conditions.”  We  carefully  examine  policies  based  on 
combined  scales  arising  from  three  approaches  in  the  literature  in  light  of  “desirable” 
properties.  We  find  that  each  of  the  three  approaches  lacks  features  important  when 
developing  multiple-scale  policies.  In  one  approach,  the  observations  are  translated  into 
many  different  scales  and  the  scale  corresponding  to  the  minimum  value  of  a  “converted” 
cost  function  is  defined  to  be  “best.”  This  approach,  although  motivated  from  the 
standpoint  of  minimizing  costs,  does  not  guarantee  nested  policies  in  the  original  scales. 
In  the  second  approach,  a  combined  scale  is  found  in  a  manner  unrelated  to  maintenance 
costs.  Policies  based  on  this  scale  have  the  same  “shape”  and  are  nested.  The  third 
approach  also  restricts  the  form  of  the  policy  in  a  manner  unrelated  to  costs.  This 
approach, although appropriate in some preventive maintenance contexts, does not seem
best  suited  for  age  replacement. 

We  consider  multiple-scale  age  replacement  in  two  settings.  In  the  first,  since  it  is 
common  in  the  literature  to  approximate  unknown  usage  paths  with  straight  lines,  we 
develop  a  procedure  based  on  the  assumption  that  devices  age  along  linear  paths.  Like 
the  scale-combining  approaches,  our  approach  lies  between  the  extremes  in  that  it  can 
result  in  different  policies  for  devices  on  different  usage  paths.  However,  our  procedure 
does  not  rely  on  finding  an  “optimal”  scale.  Instead,  it  considers  the  lifetime  distributions 
corresponding  to  devices  on  different  paths  in  a  manner  resulting  in  an  estimate  of  the 
optimal  policy  among  a  class  of  “sensible”  policies.  We  show  that  under  mild  conditions, 
the  estimated  optimal  replacement  times  are  strongly  consistent  estimators  of  the  true 
optimal  replacement  times,  and  then  show  by  simulation  that  these  estimates  are  well- 
behaved  in  small-sample  situations.  It  is  also  shown  that  our  procedure  tends  to  produce 
policies  having  lower  true  cost  than  those  based  on  the  min  CV  method. 

In  the  second  setting,  device  usage  paths  are  unknown.  We  define  the  two- 
dimensional  renewal  reward  process,  and  prove  a  two-dimensional  version  of  the 
Renewal  Reward  Theorem.  Using  this  result,  we  develop  the  cost  function  by  which  we 
can  evaluate  various  policies  under  the  assumption  of  a  joint  model  for  the  bivariate 
failure  times.  We  also  derive  the  form  of  the  cost  function  for  a  smaller  class  of 
alternatives  and  present  numerical  results  obtained  from  solving  the  corresponding 
optimization  problem  for  various  two-dimensional  failure  data  sets. 

We note that our contributions may seem to fall in the area known as
“multivariate  age  replacement.”  The  literature  in  this  realm,  however,  differs 
significantly  from  ours.  In  this  literature,  “multivariate  age”  refers  to  the  ages  of  several 
components,  where  age  is  measured  in  a  single  scale.  For  example,  Ebrahimi  (1997) 
defines $MAR(T_1, \ldots, T_k)$, the policy for multivariate age replacement for a system of k
components which replaces component i, i = 1, 2, ..., k, either at age $T_i$ or upon its
failure. For the case k = 2, Ebrahimi explains how to find the optimal $MAR(T_1, T_2)$ for both
series  and  parallel  systems.  Heinrich  and  Jensen  (1996)  also  discuss  optimal  replacement 
in  a  two-component  parallel  system,  as  does  Scheaffer  (1975). 

Numerous  extensions  to  the  dissertation  research  present  themselves.  Throughout 
this  dissertation  our  main  focus  has  been  on  data  consisting  of  ordered  pairs  representing 
the  chronological  age  at  failure  and  the  cumulative  usage  at  failure.  In  some  cases  (e.g., 
the  aircraft  wing  joint  we  mention  in  the  Introduction)  more  than  one  measure  of  usage 
may  be  available;  in  other  cases,  values  of  other  external  covariates  thought  to  impact  the 
failure  process  may  be  available.  The  concept  of  a  lower  set  generalizes  to  higher 
dimensions,  and  the  problem  of  incorporating  additional  external  covariates  into  policy 
estimation  is  worthy  of  consideration.  In  fact,  as  noted  in  the  Introduction,  the  definition 
of  time  scale  is  general  enough  to  include  such  cases.  In  the  single-scale  realm,  Love  and 
Guo  (1991)  and  Kumar  and  Westberg  (1997)  present  methods  for  obtaining  age 
replacement  policies  for  a  pressure  gauge  given  covariate  information  (the  data  set  can  be 
found  in  Appendix  B).  Both  of  these  use  a  parametric  model  to  incorporate  the  effect  of 
the covariate on gauge lifetime. The work of Makis and Jardine (1992, 1999) in the
single-scale realm is more comprehensive. They recommend a combination of age
replacement  and  “condition-based”  replacement  in  hopes  of  obtaining  replacement 
decisions  that  are  more  accurate  than  by  employing  one  approach  or  the  other.  The 
foundation  of  their  work  is  the  Cox  proportional  hazard  model  (PHM)  with  time- 
dependent  covariates.  Given  a  data  set  of  the  form  considered  in  this  dissertation,  we  can 
obtain  (in  concept)  a  multiple-scale  replacement  policy  by  treating  the  measurements 
from  the  second  time  scale  as  the  time-dependent  covariate.  Duchesne  (1999),  however, 
remarks  that  “because  models  with  covariates  treat  the  time  variable  and  the  covariates 
quite  asymmetrically,  it  is  not  recommended  to  choose  an  arbitrary  scale  as  the  main 
scale  and  the  other  scale  as  covariates.”  Farewell  and  Cox  (1979)  issue  a  similar 
warning.  Of  course,  one  can  conceive  of  a  situation  where  a  wealth  of  information  is 
available  at  device  failure,  including  time  in  various  scales  and  numerous  condition 
measurements  (some  of  which  may  be  interval  covariates  such  as  measures  of  wear).  In 
such  cases,  we  echo  Duchesne’s  (1999)  call  for  methods  for  the  systematic  identification 
of  information  categories  for  inclusion  in  models  for  device  failure. 

The procedure developed in Chapter IV relies on the assumption that for a given
r > 0, the collection of conditional distributions $\{F_i\}$ has unique and finite
$(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$. Further investigation is needed to characterize families with this
property. This would provide a means for checking model assumptions before applying
the procedure. We note that stochastic ordering (or even the stronger failure rate
ordering) of the conditional lifetimes is not sufficient to guarantee $(\tau_1^*, \tau_2^*, \ldots, \tau_m^*) \in A$.

In addition, numerous extensions were made to the basic problem with cost
function  (1.1)  in  the  years  following  its  initial  development,  as  noted  in  the  surveys  by 
McCall  (1965),  Pierskalla  and  Voelker  (1976),  and  Valdez-Flores  and  Feldman  (1989). 
Such  extensions  as  cost  discounting,  imperfect  repair,  and  others  are  also  viable  research 
topics  for  the  multiple-scale  problem. 

The  cost  function  (4.2)  by  which  we  define  the  “optimal”  composite  policy  is  of 
the  “average  of  cost  functions”  form  considered  by  Gertsbakh  and  Kordonsky  (1997). 
Letting  R  denote  the  “reward”  (cost)  of  a  replacement  and  U  the  replacement  time,  the 
estimation  of  the  optimal  policy  based  on  a  “true”  reward  functional  of  the  form 
E[R]/E[U] for the linear path case would also be a worthwhile pursuit. Here E[R] and
E[U]  could  be  found  by  a  conditioning  approach  (e.g.,  Ross,  1997).  This  function  has  a 
slightly  different  interpretation  than  the  one  in  (4.2),  and  is  closely  related  to  (6.3). 

Finally,  we  note  much  can  be  built  on  the  foundation  created  in  Chapter  VI,  where 
we  focus  on  non-parametric  policy  estimation  for  the  case  in  which  observations  do  not 
fall  on  linear  paths.  For  example,  we  concentrate  specifically  on  rectangles.  While  such 
policies  are  easily  implemented,  it  is  conceivable  that  other  members  of  Mx  may  result  in 
lower  cost  than  the  “best”  rectangular  policy  (if  it  exists)  for  a  given  F.  For  instance,  for 
some  F,  the  class  of  policies  bounded  by  the  quantile  curves  of  F  may  be  worthy  of 
consideration.  Under  such  a  policy  (much  like  the  policies  based  on  an  ideal  time  scale) 
the  probability  of  failure  before  replacement  would  be  identical  for  devices  on  any  usage 
path.  However,  implementation  may  be  difficult  due  to  the  shape  of  such  a  policy.  It 
may  also  be  fruitful  to  consider  clustering  methods  for  the  case  of  unknown  usage  paths. 
In such an approach, observations (X,Y) could be clustered by their (estimated) usage rate
Y/X  and  then  projected  onto  the  line  with  slope  corresponding  to  their  respective  cluster 
center.  With  the  data  in  this  form,  the  techniques  of  Chapter  IV  could  then  be  applied  to 
the  “projected”  data.  A  similar  approach  was  suggested  by  Duchesne  (1999)  for  non- 
parametric  estimation  of  the  ITS. 
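
One way such a clustering step might look is sketched below in Python. The sketch makes several assumptions that the text above does not fix: the clustering method is taken to be plain one-dimensional k-means on the usage rates, the projection is taken to be orthogonal, and the function names cluster_rates, project_onto_path, and project_data are hypothetical. Because orthogonal projection depends on the units of the two scales, other projections (for example, holding X fixed) may be equally reasonable.

```python
def cluster_rates(rates, k, iterations=50):
    """1-D k-means (Lloyd's algorithm) on usage rates; returns the k centers."""
    ordered = sorted(rates)
    n = len(ordered)
    # Deterministic start: spread the initial centers across the sorted rates.
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for r in rates:
            nearest = min(range(k), key=lambda j: abs(r - centers[j]))
            groups[nearest].append(r)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def project_onto_path(x, y, slope):
    """Orthogonal projection of (x, y) onto the line y = slope * x (an assumption)."""
    scale = (x + slope * y) / (1.0 + slope * slope)
    return scale, slope * scale


def project_data(data, k):
    """Assign each (x, y) to the nearest rate center and project it onto that path."""
    rates = [y / x for x, y in data]
    centers = cluster_rates(rates, k)
    projected = []
    for (x, y), r in zip(data, rates):
        slope = min(centers, key=lambda c: abs(r - c))
        projected.append((slope, project_onto_path(x, y, slope)))
    return projected
```

Each returned pair holds the slope of the assigned path and the projected failure time, which is the form of input assumed by a procedure for data on linear usage paths.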

APPENDIX A: RENEWAL THEORETIC DEFINITIONS AND DERIVATION OF COST FUNCTION

A.  DEFINITIONS 

The following renewal theoretic definitions are from Ross (1997). A stochastic
process $\{N(t);\ t \ge 0\}$ is a counting process if N(t) represents the total number of events
that have occurred up to time t. Let $\{N(t);\ t \ge 0\}$ be a counting process and let $X_n$ denote
the time between the (n-1)st and nth event of this process, $n \ge 1$ (henceforth these times
will be called “inter-renewal times”). If the inter-renewal times $\{X_n\}$ are independent and
identically distributed (iid), the counting process $\{N(t);\ t \ge 0\}$ is a renewal process; a
“renewal” has taken place when an event has occurred. Given a renewal process
$\{N(t);\ t \ge 0\}$ with inter-renewal times $\{X_n\}$, let $R_n$ denote the reward earned at the time of
the nth renewal. Assume the $R_n$, $n \ge 1$, are iid; $R_n$ may depend on $X_n$. Let
$Z(t) = \sum_{n=1}^{N(t)} R_n$ represent the total reward earned up to time t; $\{Z(t);\ t \ge 0\}$ is a
renewal reward process.

B.  DERIVATION  OF  SINGLE-SCALE  COST  FUNCTION 

Consider a device which is maintained under an age replacement policy; that is,
the device is replaced upon failure or when it reaches age $\tau$, whichever comes first
(assume the replacement time is negligible). For example, consider a large supply of
identical light bulbs. Upon failure, a light bulb is replaced instantly; operating conditions
remain identical from one light bulb to the next. Assume replacement devices are as
good as new. Let $X_n$ denote the lifetime of the nth device; assume $X_1, X_2, \ldots$ are iid with
distribution function F and survivor function S. For simplicity, assume F is absolutely
continuous with density f; Nakagawa and Osaki (1977) discuss the discrete version of this
problem. Let $U_n = \min\{X_n, \tau\}$ denote the time between the (n-1)st and nth replacement;
assume a replacement has occurred at time 0. Let N(t) denote the number of replacements
to occur in (0, t]; by the assumptions made thus far $\{N(t);\ t \ge 0\}$ is a counting process
with times between events iid and is therefore a renewal process. Suppose the cost for
replacement is K > 0 if replaced due to age (i.e., preventively) and (K + C) if replaced due
to failure (assume C > 0; this indicates the costly nature of a replacement during
operation). Let Z(t) denote the total cost incurred in (0, t]; $\{Z(t);\ t \ge 0\}$ is a renewal
reward process with inter-renewal times $\{U_n\}$, where $U_n = \min\{X_n, \tau\}$,
$R_n = K + C\, I[X_n < \tau]$, and $Z(t) = \sum_{n=1}^{N(t)} R_n$. Ross (1997) proves that if $E[R_1] < \infty$ and
$E[U_1] < \infty$, the long-run average cost per unit of time in use is

\[ \lim_{t \to \infty} Z(t)/t = E[R_1]/E[U_1] \]

with probability 1. If we say a “cycle” is completed every time a replacement occurs, this
limit is the “expected reward per cycle” over the “expected cycle length.” We now
compute $E[R_1]$ and $E[U_1]$. Since $R_1 = K + C\, I[X_1 < \tau]$, we find $E[R_1] = K + C F(\tau)$.
Since $U_1 = X_1 I(X_1 < \tau) + \tau I(X_1 \ge \tau)$, we find

\[ E[U_1] = \int_0^{\tau} t f(t)\, dt + \int_{\tau}^{\infty} \tau f(t)\, dt, \]

which reduces to $\int_0^{\tau} S(u)\, du$. Thus, the long-run average cost per unit of time in use as a
function of $\tau$ is (1.1).
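
The ratio just derived, $(K + C F(\tau)) / \int_0^{\tau} S(u)\, du$, is easy to evaluate numerically for any lifetime distribution whose distribution function can be computed. The Python sketch below is an illustration only: the function names are hypothetical, the Weibull family is used merely as a convenient example, and the integral is approximated by the trapezoid rule with an arbitrarily chosen number of steps.

```python
import math


def weibull_cdf(t, shape, scale):
    """Distribution function of a Weibull lifetime (illustrative choice of F)."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def age_replacement_cost(tau, K, C, cdf, steps=2000):
    """Long-run average cost (K + C F(tau)) / integral_0^tau S(u) du.

    The expected cycle length is approximated by the trapezoid rule.
    """
    h = tau / steps
    survivor = [1.0 - cdf(i * h) for i in range(steps + 1)]
    expected_cycle = h * (sum(survivor) - 0.5 * (survivor[0] + survivor[-1]))
    return (K + C * cdf(tau)) / expected_cycle
```

With shape 1 the Weibull reduces to the exponential distribution, for which the expected cycle length has the closed form $1 - e^{-\tau}$ at unit scale; this gives a simple check on the numerical integration.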

C.  SINGLE-SCALE  COST  FUNCTION  AND  SCALE  FAMILIES 

The  following  lemma  shows  that  (1.1)  behaves  “as  expected”  in  scale  families. 

Lemma A.1 (Optimal Replacement Time Ordering in Scale Families): Let Z
and Y denote lifetimes from distributions $F_Z$ and $F_Y$, respectively, where $Z \sim aY$, with
a > 0. Let K and C > 0. Let $\tau_Z^*$ and $\tau_Y^*$ minimize (1.1) when $F = F_Z$ and $F_Y$, respectively.
Then, $\tau_Z^* = a \tau_Y^*$.

Proof: Let $\tau > 0$. Then, by definition

\[ C_Z(\tau) = \frac{K + C F_Z(\tau)}{\int_0^{\tau} S_Z(u)\, du}. \]

It follows that

\[ C_Z(\tau) = \frac{K + C F_Y(\tau/a)}{a \int_0^{\tau} S_Y(u/a)(1/a)\, du} = \frac{1}{a} \cdot \frac{K + C F_Y(\tau/a)}{\int_0^{\tau/a} S_Y(u)\, du} = \frac{1}{a} C_Y(\tau/a). \]

But then

\[ \begin{aligned} \tau_Z^* &= \arg\min C_Z(\tau) \\ &= \arg\min \frac{1}{a} C_Y(\tau/a) \\ &= \arg\min C_Y(\tau/a) \\ &= a \tau_Y^*, \end{aligned} \]

where the last two equalities follow by observing that (1) minima are preserved under
vertical shrinking, and (2) minima are scaled upon horizontal stretching.
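
Lemma A.1 can be checked numerically for a particular scale family. The Python sketch below is an illustration under assumptions of our own choosing: lifetimes are Weibull with shape 2 (for which $\int_0^{\tau} S(u)\, du$ has a closed form in terms of the error function), the minimizer is located by a simple grid search, and the values of K, C, and a are arbitrary. The names cost and argmin_on_grid are hypothetical.

```python
import math


def cost(tau, K, C, scale):
    """Cost (K + C F(tau)) / integral_0^tau S(u) du for a Weibull(shape 2) lifetime."""
    failure_prob = 1.0 - math.exp(-((tau / scale) ** 2))
    # For S(u) = exp(-(u/scale)^2), the integral is scale * sqrt(pi)/2 * erf(tau/scale).
    expected_cycle = scale * math.sqrt(math.pi) / 2.0 * math.erf(tau / scale)
    return (K + C * failure_prob) / expected_cycle


def argmin_on_grid(K, C, scale, upper, steps=20000):
    """Grid search for the replacement age minimizing the cost on (0, upper]."""
    grid = [upper * i / steps for i in range(1, steps + 1)]
    return min(grid, key=lambda tau: cost(tau, K, C, scale))


K, C, a = 1.0, 10.0, 3.0
tau_y = argmin_on_grid(K, C, scale=1.0, upper=5.0)        # lifetime Y
tau_z = argmin_on_grid(K, C, scale=a, upper=5.0 * a)      # lifetime Z ~ aY
print(tau_y, tau_z, tau_z / tau_y)
```

Up to the resolution of the grid, the printed ratio should equal a, in agreement with the lemma; the identity $C_Z(\tau) = (1/a) C_Y(\tau/a)$ used in the proof can be checked pointwise in the same way.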

APPENDIX B: DATASETS


1.  Automobile  data.  This  data  set  consists  of  19  failure  times  in  days  since  purchase  and 
number  of  miles  driven  (to  the  nearest  100  miles)  for  a  particular  automobile  component. 
The  data  set  is  taken  from  Wilson  (1993,  p.  32).  The  data  are  presented  in  the  table 
below. 


Failure    Days           Miles

 1         146             3200
 2         251            11100
 3         251            11100
 4         470            14100
 5          26             8400
 6         330             8500
 7         [illegible]     6800
 8         210             9100
 9         368             6500
10          68             1200
11         340            11000
12         384            12400
13         286             8000
14         306            10300
15         105             1900
16          24             1100
17          95             2200
18         101             4200
19         187             2400

Table B.1: Auto Data

2. Metal fatigue data. This data set was discussed in Kordonsky and Gertsbakh (1993,
p. 240); a summary of their description of the data set follows. A sample of 30 identical
steel specimens was divided into six groups of size five; each group was subjected to a
cyclic two-level loading regime until failure. The loading regime for group j was a
periodic sequence of 5000 loading cycles consisting of $5000\alpha_j$ cycles of small amplitude
(i.e., low load) followed by $5000(1-\alpha_j)$ cycles of large amplitude (i.e., high load),
j = 1, ..., 6. The table below records the cumulative number of low cycles and high cycles at
failure for each specimen, scaled by a factor of 10.


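Specimen   α_j           Low/10        High/10

 1         0.95          25680         1350
 2         [illegible]   [illegible]   1160
 3         [illegible]   [illegible]   1925
 4         [illegible]   [illegible]   1750
 5         [illegible]   [illegible]   2000
 6         0.80          [illegible]   [illegible]
 7         0.80          [illegible]   [illegible]
 8         0.80          [illegible]   [illegible]
 9         0.80          15600         [illegible]
10         [illegible]   [illegible]   [illegible]
11         [illegible]   [illegible]   [illegible]
12         [illegible]   [illegible]   [illegible]
13         0.60          [illegible]   [illegible]
14         [illegible]   [illegible]   [illegible]
15         [illegible]   [illegible]   [illegible]
16         0.40          3200          4570
17         [illegible]   [illegible]   [illegible]
18         [illegible]   [illegible]   [illegible]
19         [illegible]   4200          6060
20         0.40          5400          8040
21         [illegible]   [illegible]   [illegible]
22         [illegible]   [illegible]   [illegible]
23         [illegible]   [illegible]   [illegible]
24         0.20          1900          7260
25         0.20          1100          4200
26         0.05          300           5390
27         [illegible]   375           6855
28         [illegible]   425           7795
29         [illegible]   332           5795
30         [illegible]   275           5125

Table B.2: Metal Data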

3. Traction motor data. This data set comes from the railroad industry, and is found in
Wilson  (1993,  p.  31).  Table  B.3  contains  the  time  since  inception  of  service  and  mileage 
at  failure  of  forty  locomotive  traction  motors  when  they  were  returned  to  the  depot  for 
maintenance. 


 i    miles    days      i    miles    days

 1     9766     166     21     5922     128
 2     2041      35     22     1974      31
 3    12392     249     23     2030      65
 4     9889     190     24    12532     221
 5      974      27     25    14796     316
 6     1594      41     26      979      22
 7     2128      59     27    15062     261
 8     2158      75     28     2062      32
 9    11187     223     29    16888     397
10    47660     952     30     3099      48
11    13827     335     31       28       1
12     5992     164     32       95      27
13     6925     145     33    12600     295
14     7078     170     34     8067     140
15     7553     140     35    41425     827
16    25014     498     36      105       2
17    25380     571     37    12302     209
18    26433     499     38      447      29
19    16494     340     39     9766     166
20     7162     160     40    57304    1200

Table B.3: Traction Motor Data

4. Jet engine failure data. This data set is discussed in Gertsbakh and Kordonsky (1998,
p.  1186)  and  was  obtained  from  the  first  author.  Table  B.4  contains  the  flight  hours  and 
number  of  landings  at  failure  of  21  jet  engines. 


Table  B.4:  Jet  Engine  Data 


5.  Pressure  gauge  data.  The  table  below  contains  the  failure  (or  censoring,  if  marked  by 
an  asterisk)  time  in  hours  of  15  pressure  gauges  and  the  corresponding  covariate  value 
“pressure.”  The  data  set  is  from  Love  and  Guo  (1991,  p.  14).  The  implication  is  that  the 
value  of  the  covariate  was  fixed  during  each  particular  life  cycle.  Thus,  for  example,  the 
first  entry  indicates  that  “medium”  (in  some  sense)  pressures  were  measured  from  time  0 
until  failure  at  70  hours. 


 i    Time (hrs)    Pressure

 1     70           4
 2     53           4
 3     77           4
 4     42           4
 5     61*          4
 6     51           5
 7     70           5
 8     32           5
 9     47           5
10     44*          5
11    101           3
12     66           3
13    198           3
14     95           3
15     60*          3

Table B.5: Pressure Gauge Data